2  EXECUTIVE  SUMMARY


Computer-based,  decentralized  decision  making  involving  planning,  scheduling,  and 
resource  allocation  (DPSRA)  problems  is  increasingly  important  as  we  attempt  to  create 
agile  military  and  commercial  organizations  that  can  exploit  the  enormous  amount  of 
information  that  is  available  on-line  and  the  emerging  capability  for  on-line  organizational 
interaction  (e.g.,  enterprise  integration  systems,  the  electronic  marketplace,  etc.).  Examples 
of  DPSRA  problems  include  logistical  resource  scheduling,  crisis  management,  and 
concurrent  engineering.  The  design  of  such  applications  is  fraught  with  difficulties  because 
agents  in  such  systems  cannot  independently  avoid  conflicts,  cannot  access  a  global 
perspective  to  schedule  their  actions,  cannot  easily  search  for  solutions  in  isolation,  cannot 
respond  statically  to  real-time  deadlines,  and  must  cope  with  an  uncertain  and  changing 
environment.  A  major  hurdle  facing  the  construction  of  DPSRA  applications  is  the  lack  of  a 
generic  framework  for  solving  the  difficulties  outlined  above.  This  generic  framework  will 
make  it  possible  to  significantly  speed  up  the  development  of  future  DPSRA  applications. 

Central  to  our  approach  is  the  creation  of  a  Coordinated  Negotiated  Search  (CNS) 
framework, one that views negotiation and coordination as integral parts of the
cooperative  search  process  for  a  solution  mutually  acceptable  to  all  agents.  This  framework 
integrates  a  wide  range  of  negotiation  strategies  for  different  situations.  These  strategies  are 
based  on  a  sophisticated  view  of  negotiation  as  a  multi-level,  multi-stage,  and  multi- 
anchored  process  in  which  agents  not  only  exchange  domain  proposals  and  critiques  but 
also exchange meta-level information about their dynamically evolving local and composite
search  spaces.  These  strategies  do  not  depend  solely  on  centralized  mediation,  or  unlimited 
communication  or  computational  resources,  or  on  agents  having  homogeneous  structures 
or  representations.  DPSRA  systems  will  not  work  on  only  one  problem  at  a  time,  but  rather 
on  a  continually  evolving  set  of  interrelated  problems.  Coordination  strategies  are  based  on 
domain  independent  coordination  relationships  among  tasks.  This  approach  clearly 
delineates the coordination component of a distributed agent from the agent's local
scheduling  mechanisms.  Strategies  for  coordination  (of  problem  solving,  negotiation,  and 
monitoring) have been modeled and analyzed based on the quantitative properties of these
coordination  relationships  and  of  other  characteristics  of  the  environment. 

The  specific  applications  we  have  used  to  exemplify  the  coordination  and  negotiation  issues 
of  DPSRA  systems  are  airline  terminal  resource  scheduling  (gates,  fuel  trucks,  bag  trucks, 
etc.),  multi-depot  vehicle  routing,  and  cooperative  agent  design  of  steam  condensers. 
Recent  work,  though  not  yet  completely  implemented,  has  involved  a  cooperative 
information gathering application running on the Internet. Additionally, a sophisticated
simulation system, called TAEMS, has been constructed for testing the effectiveness
of  different  coordination  strategies. 


3  SUMMARY  OF  TECHNICAL  RESULTS 

The  following  represent  the  major  technical  accomplishments  of  the  contract: 

•  Development of GPGP, the first domain-independent architecture for distributed, real-time
agent coordination.

A  family  of  generic,  real-time  distributed  coordination  algorithms  for  use  with  cooperative 
agents  has  been  developed,  called  GPGP.  This  family  allows  for  a  wide  range  of 
cooperation  strategies,  tailored  to  the  needs  of  the  specific  application  environment,  to  be 
implemented  within  a  simple  and  extensible  framework.  This  framework  makes  a  clear 
separation  among  the  coordination  module,  the  local  real-time  scheduler  and  the  application 
program. This is important because in many real applications we do not want to replace the
existing  application,  but  rather  improve  its  performance  by  applying  coordination 
techniques.  Thus,  these  coordination  mechanisms  must  interact  smoothly  with  existing 
system  components. 

•  Development of TAEMS, a simulation system and formal language for studying agent
coordination issues.

As  part  of  the  effort  in  developing  GPGP,  we  have  also  completed  a  formal  framework, 
TAEMS  (Task  Analysis,  Environment  Modeling,  and  Simulation),  for  specifying  task 
environments  and  analyzing  coordination  algorithms  with  respect  to  multiple  performance 
criteria,  and  we  have  implemented  a  simulator.  TAEMS  allows  users  to  do  both  exploratory 
research  into  possible  performance  effects  and  to  verify  analytically  derived  models  of  the 
effects  of  environmental  characteristics  on  coordination  algorithm  performance.  We  used 
this  framework  to  build  a  model  of  a  simplified,  distributed  interpretation  task,  and 
examined  the  question  of  how  various  organizations  of  agents  would  perform  in  this 
environment,  building  an  analytic  model  for  describing  the  performance  of  three 
coordination  algorithms.  It  has  also  been  applied  recently  to  a  problem  involving  NASA 
regarding  how  to  coordinate  its  proposed  distributed  data  analysis  centers. 

•  Development of TEAM, a reusable agent architecture for concurrent engineering design
and its realization in a sophisticated design application involving seven expert systems.

To  support  the  integration  of  heterogeneous  and  reusable  agents  into  functional  agent  sets, 
a  multiagent  framework,  TEAM,  has  been  implemented.  Conflict  is  an  integral  part  of 
problem  solving  in  multi-agent  systems  and  is  often  the  focal  point  of  interaction  among 
agents.  Our  work  acknowledges  conflict  as  a  driving  force  in  the  control  of  distributed- 
search activity. The effectiveness of this architecture has been investigated in a seven-agent
steam  condenser  design  system.  This  system  outperforms  an  existing  mechanical  design 
system  that  exploits  the  same  knowledge  but  is  not  structured  as  a  multiagent  negotiation 
process. 

•  Development of DARM, a complex distributed scheduling system involving the
scheduling of resources at an airport.

The  DARM  (Distributed  Airport  Resource  Manager)  distributed  scheduling  application  has 
been  used  to  verify  and  extend  earlier  work  by  Sycara  et  al.  on  the  importance  and  role  of 
meta-level  information  in  achieving  efficient  distributed  scheduling.  A  testbed  has  been 
created  that  can  be  configured  as  a  community  of  two  or  more  agents,  each  with  its  own 
resources  (i.e.,  gates,  fuel  trucks,  baggage  handlers)  and  each  responsible  for  satisfying  its 
own  schedule  of  arriving  and  departing  flights.  The  need  for  cooperation  and  negotiation 
arises  because  an  individual  agent  may  lack  sufficient  resources  to  satisfy  its  schedule  and 
may  have  to  borrow  these  resources  from  other  agents.  By  coordinating  the  individual 
scheduling  efforts  so  that  each  agent  understands  the  probable  requirements  of  other  agents, 
the  likelihood  increases  that  remote  agents  will  be  able  to  lend  the  appropriate  resource  at  the 
time  it  is  required.  In  the  event  that  no  globally  satisfactory  solution  can  be  found,  agents 
must  negotiate  in  order  to  determine  which  local  constraints  can  be  relaxed  to  enable  such  a 
solution  to  be  developed.  The  presence  of  meta-level  information  allows  agents  to  more 
accurately  determine  whether  to  solve  subproblems  locally  (through  backtracking  and 
constraint  relaxation)  or  whether  to  apply  to  other  agents  for  resources.  This  system  is  the 
most  sophisticated  distributed  scheduling  application  developed  to  date  and  therefore 
represents  an  important  data  point  in  assessing  the  feasibility  of  a  distributed  scheduling 
approach  for  real  applications.  A  side  benefit  of  this  effort  has  been  the  development  of  the 
DSS domain-independent job shop scheduling system, whose performance benchmarks are
comparable to or better than those of existing systems.

•  Development  of  a  new  negotiation  protocol  for  electronic  commerce;  the  advantages  of 
this  protocol  over  the  contract-net  protocol  have  been  formally  justified. 

We  have  significantly  extended  the  original  contract-net  negotiation  protocol  for  use  with 
self-interested  agents  involved  in  electronic  commerce.  An  important  aspect  of  this  protocol 
is that it allows for a contract to specify a unilateral decommitment penalty. We have shown
that  this  capability  improves  expected  social  welfare  and  Pareto  efficiency  of  contracts  by 
allowing  better  accommodation  of  future  events. 
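
As a rough illustration of why a decommitment penalty lets a contract accommodate future events, the sketch below models a contractor who receives a better outside offer after signing. The `Contract` fields and the decision rule are assumptions made for this example; they are not the protocol developed under the contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """A contract with a unilateral decommitment penalty (hypothetical fields)."""
    payoff: float            # value to the contractor of honoring the contract
    decommit_penalty: float  # amount paid to the other party on backing out


def should_decommit(contract: Contract, outside_payoff: float) -> bool:
    """Back out only if the new opportunity still wins after paying the penalty."""
    return outside_payoff - contract.decommit_penalty > contract.payoff


def realized_payoff(contract: Contract, outside_payoff: float) -> float:
    """Payoff the contractor ends up with under the simple rule above."""
    if should_decommit(contract, outside_payoff):
        return outside_payoff - contract.decommit_penalty
    return contract.payoff
```

With a payoff of 10 and a penalty of 3, an outside offer of 12 is declined, while an offer of 15 is accepted and nets 12 after the penalty is paid to the other party.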

Based  on  these  technical  accomplishments  and  the  verification  of  the  usefulness  of  these 
approaches  both  empirically  through  implementing  them  on  complex  applications  and 
through formal analytic techniques, we have made significant progress in our goal of
developing  generic  architectures  appropriate  for  distributed  planning,  scheduling,  and 
resource  allocation  (DPSRA)  problems. 
software activities.

•  Experience gained from the TEAM concurrent engineering architecture is being applied
to  Ford  Research  Labs  problems  through  a  contract  with  Blackboard  Technology,  Inc. 
Abstract

Many  researchers  have  shown  that  there  is  no  single  best  organization  or  coordination 
mechanism  for  all  environments.  This  paper  discusses  the  design  and  implementation  of  an 
extendable  family  of  coordination  mechanisms,  called  Generalized  Partial  Global  Planning 
(GPGP).  The  set  of  coordination  mechanisms  described  here  assists  in  scheduling  activities  for 
teams  of  cooperative  computational  agents.  The  GPGP  approach  has  several  unique  features. 
First,  it  is  not  tied  to  a  single  domain.  Each  mechanism  is  defined  as  a  response  to  certain 
features  in  the  current  task  environment.  We  show  that  different  combinations  of  mechanisms 
are  appropriate  for  different  task  environments.  Secondly,  the  approach  works  in  conjunction 
with an agent's existing local planner/scheduler. Finally, the initial set of five mechanisms
presented  here  generalizes  and  extends  the  Partial  Global  Planning  (PGP)  algorithm.  In 
comparison  to  PGP,  GPGP  schedules  tasks  with  deadlines,  it  allows  agent  heterogeneity,  it 
exchanges  less  global  information,  and  it  communicates  at  multiple  levels  of  abstraction.  We 
analyze  the  performance  of  several  GPGP  algorithm  family  members  and  one  centralized  upper 
bound  reference  algorithm,  using  data  from  simulations  of  multiple  agent  teams  working  in 
abstract  task  environments.  We  show  how  to  decide  if  adding  a  new  mechanism  is  useful, 
and  suggest  a  way  to  prune  the  search  for  an  appropriate  combination  of  mechanisms  in  an 
environment. 


^  A shorter version of this paper appeared in the Proceedings of the First International Conference on
Multi-Agent Systems (ICMAS-95), San Francisco, June 1995. This work was supported by DARPA
contract N00014-92-J-1698, Office of Naval Research contract N00014-92-J-1450, and NSF contract
IRI-9321324.  The  content  of  the  information  does  not  necessarily  reflect  the  position  or  the  policy  of 
the  Government  and  no  official  endorsement  should  be  inferred. 
Introduction


This  paper  presents  a  formal  description  of  the  implementation  of  a  domain  independent 
scheduling  coordination  approach  which  we  call  Generalized  Partial  Global  Planning  (GPGP). 
The  GPGP  approach  consists  of  an  extendable  set  of  modular  coordination  mechanisms,  any 
subset  or  all  of  which  can  be  used  in  response  to  a  particular  task  environment.  Each  mechanism 
is defined using our formal framework for expressing coordination problems (TÆMS [8]). GPGP
both  generalizes  and  extends  the  Partial  Global  Planning  (PGP)  algorithm  [10]. 

Our  approach  has  several  unique  features: 

•  Each  mechanism  is  defined  as  a  response  to  certain  features  in  the  current  subjective  task 
environment.  Each  mechanism  can  be  removed  entirely,  or  can  be  parameterized  so  that 
it  is  only  active  for  some  portion  of  an  episode.  New  mechanisms  can  be  defined;  an  initial 
set  of  five  mechanisms  is  examined  that  together  approximate  the  original  PGP  behavior. 
Eventually  we  intend  to  develop  a  library  of  reusable  coordination  mechanisms.  The 
individual  coordination  mechanisms  rest  on  a  shared  substrate  that  arbitrates  between 
the mechanisms and the agent's local scheduler in a decision-theoretic manner.

•  GPGP  works  in  conjunction  with  an  existing  agent  architecture  and  local  scheduler.  The 
experimental results reported here were achieved using a ‘design-to-time’ real-time local
scheduler  developed  by  Garvey  [13]. 

•  GPGP, unlike PGP, is not tied to a single domain. GPGP allows more agent heterogeneity
than PGP with respect to agent capabilities. GPGP mechanisms in general exchange
less information than the PGP algorithm, and the information that GPGP mechanisms
exchange can be at different levels of abstraction. PGP agents communicated complete
schedules at a single, fixed level of abstraction. GPGP mechanisms communicate
scheduling  commitments  to  particular  tasks,  at  any  convenient  level  of  abstraction. 

The  GPGP  approach  views  coordination  as  modulating  local  control,  not  replacing  it.  This 
process  occurs  via  a  set  of  domain-independent  coordination  mechanisms  that  post  constraints 
to  the  local  scheduler  about  the  importance  of  certain  tasks  and  appropriate  times  for  their 
initiation  and  completion.  An  example  of  a  GPGP  coordination  mechanism  is  the  one  that 
handles  simple  method  redundancy.  If  more  than  one  agent  has  an  otherwise  equivalent 
method  for  accomplishing  a  task,  then  an  agent  that  schedules  such  a  method  will  commit  to 
executing  it,  and  will  notify  the  other  agents  of  its  commitment.  If  more  than  one  agent  should 
happen  to  commit  to  a  redundant  method,  the  mechanism  takes  care  of  retracting  all  but  one 
of  the  redundant  commitments. 
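
The redundancy example above can be sketched as follows. This is a minimal illustration, not the GPGP implementation: the `Commitment` record and the tie-break rule (earliest commitment wins, then lowest agent name) are assumptions chosen so that every agent applying the rule to the same set of commitments keeps the same one.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Commitment:
    agent: str  # agent that committed to execute the redundant method
    task: str   # task that the redundant methods would accomplish
    time: int   # time at which the commitment was made


def resolve_redundancy(
    commitments: List[Commitment],
) -> Tuple[Commitment, List[Commitment]]:
    """Keep one commitment to a redundant task; return the rest as retractions."""
    if not commitments:
        raise ValueError("no commitments to resolve")
    # Assumed tie-break: earliest commitment first, then agent name.
    ordered = sorted(commitments, key=lambda c: (c.time, c.agent))
    return ordered[0], ordered[1:]
```

If agents X and Y both commit to the same task at time 3, the rule keeps X's commitment and marks Y's for retraction.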

By  concentrating  on  the  creation  of  local  scheduling  constraints,  we  avoid  the  sequentiality 
of  scheduling  in  the  original  PGP  algorithm  that  occurs  when  there  are  multiple  plans.  By 
having  separate  modules  for  coordination  and  local  scheduling,  we  can  also  take  advantage  of 
advances  in  real-time  scheduling  to  produce  cooperative  distributed  problem  solving  systems 
that  respond  to  real-time  deadlines.  We  can  also  take  advantage  of  local  schedulers  that  have 
a  great  deal  of  domain  scheduling  knowledge  already  encoded  within  them.  Finally,  our 
approach  allows  consideration  of  termination  issues  that  were  glossed  over  in  the  PGP  work 
(where termination was handled by an external oracle). Nothing in TÆMS, the underlying
task structure representation, requires agents to be cooperative, antagonistic, or simply
self-motivated.

Besides  the  obvious  connections  to  the  earlier  PGP  work,  GPGP  builds  on  work  by 
von Martial [20] in detecting and reacting to relationships (such as von Martial’s “favor”
relationship).  GPGP  also  uses  a  notion  of  social  commitments  similar  to  those  discussed  by 
[2,  19,  1,  15].  Durfee’s  newer  work  [9]  is  based  on  a  hierarchical  behavior  space  representation 
that  like  GPGP  allows  agents  to  communicate  at  multiple  levels  of  detail.  The  mechanisms 
presented  in  this  paper  deal  with  coordination  while  agents  are  scheduling  (locating  in  time) 
their  activities  rather  than  while  they  are  planning  to  meet  goals.  This  allows  them  to  be 
used  in  distributed  scheduling  systems,  agenda-based  systems  (like  blackboard  systems),  or 
systems  where  agents  instantiate  previous  plans  (like  case-based  planning  systems).  The  focus 
on  mechanisms  for  coordinating  schedules  is  thus  slightly  different  from  work  that  focuses 
on  multi-agent  planning  [14,  11].  Shoham  and  Tennenholtz’s  ‘social  laws’  approach  [18]  can 
be  viewed  as  one  which  tries  to  change  the  (perceived)  structure  of  the  tasks  by,  for  example, 
restricting  the  agents’  possible  activities.  Intelligent  agents  might  use  all  of  these  approaches  at 
one  time  or  another. 

The  next  section  will  briefly  re-introduce  our  framework  for  representing  coordination 
problems,  and  summarize  the  assumptions  we  make  about  an  agent’s  internal  architecture. 
We  then  describe  the  GPGP  substrate  and  five  coordination  mechanisms.^  Previous  work  has 
shown  how  the  GPGP  approach  can  duplicate  and  extend  the  behaviors  of  the  PGP  algorithm 
[5];  Section  4  summarizes  several  new  results  that  are  reported  in  [4]  concerning  this  approach’s 
performance,  adaptability,  and  extendibility.  We  conclude  with  a  look  at  our  future  directions. 

1.1  Representing The Task Environment

Coordination  is  the  process  of  managing  interdependencies  between  activities  [17].  If  we  view 
an  agent  as  an  entity  that  has  some  beliefs  about  the  world  and  can  perform  actions,  then  the 
coordination  problem  arises  when  any  or  all  of  the  following  situations  occur:  the  agent  has 
a  choice  of  actions  it  can  take,  and  that  choice  affects  the  agent’s  performance;  the  order  in 
which  actions  are  carried  out  affects  performance;  the  time  at  which  actions  are  carried  out 
affects  performance.  The  coordination  problem  of  choosing  and  temporally  ordering  actions 
is  made  more  complex  because  the  agent  may  only  have  an  incomplete  view  of  the  entire  task 
structure  of  which  its  actions  are  a  part,  the  task  structure  may  be  changing  dynamically,  and 
the  agent  may  be  uncertain  about  the  outcomes  of  its  actions.  If  there  are  multiple  agents  in 
an  environment,  then  when  the  potential  actions  of  one  agent  are  related  to  those  of  another 
agent,  we  call  the  relationship  a  coordination  relationship.  Each  GPGP  coordination  mechanism 
is  a  response  to  some  coordination  relationship. 

The TÆMS framework (Task Analysis, Environment Modeling, and Simulation) [8] represents
coordination problems in a formal, domain-independent way. We have used it to
represent coordination problems in distributed sensor networks, hospital patient scheduling,
airport resource management, distributed information retrieval, pilot’s associate, local area
network diagnosis, etc. [4]. In this paper we will describe an agent’s current subjective beliefs

^These  five  mechanisms  are  oriented  towards  producing  PGP-like  ‘cooperative  team’  behavior.  Mechanisms 
for  self-interested  agents  are  also  possible. 
about the structure of the problem it is trying to solve by using the TÆMS framework [8, 4].
For this purpose, there are two unique features of TÆMS. The first is the explicit, quantitative
representation  of  task  interrelationships  as  functions  that  describe  the  effect  of  activity  choices 
and  temporal  orderings  on  performance.  The  second  is  the  representation  of  task  structures  at 
multiple levels of abstraction. The highest level of abstraction is called a task group, and contains
all  tasks  that  have  explicit  computational  interrelationships.  A  task  is  simply  a  set  of  lower-level 
subtasks  and/or  executable  methods.  The  components  of  a  task  have  an  explicitly  defined  effect 
on  the  quality  of  the  encompassing  task.  The  lowest  level  of  abstraction  is  called  an  executable 
method.  An  executable  method  represents  a  schedulable  entity,  such  as  a  blackboard  knowledge 
source  instance,  a  chunk  of  code  and  its  input  data,  or  a  totally-ordered  plan  that  has  been 
recalled  and  instantiated  for  a  task.  A  method  could  also  be  an  instance  of  a  human  activity  at 
some useful level of detail, for example, “take an X-ray of patient 1’s left foot”.

A coordination problem instance (called an episode E) is defined as a set of task groups,
each with a deadline D(T), such as $E = \langle T_1, T_2, \ldots, T_n \rangle$. Figure 1 shows an objective^ task
group and agent A's subjective view of that same task group. A common performance goal
of  the  agent  or  agents  is  to  maximize  the  sum  of  the  quality  achieved  for  each  task  group 
before  its  deadline.  A  task  group  consists  of  a  set  of  tasks  related  to  one  another  by  a  subtask 
relationship  that  forms  an  acyclic  graph  (here,  a  tree).  Tasks  at  the  leaves  of  the  tree  represent 
executable  methods,  which  are  the  actual  instantiated  computations  or  actions  the  agent  will 
execute  that  produce  some  amount  of  quality  (in  the  figure,  these  are  shown  as  boxes).  The 
circles  higher  up  in  the  tree  represent  various  subtasks  involved  in  the  task  group,  and  indicate 
precisely  how  quality  will  accrue  depending  on  what  methods  are  executed  and  when.  The 
arrows  between  tasks  and/or  methods  indicate  other  task  interrelationships  where  the  execution 
of  some  method  will  have  a  positive  or  negative  effect  on  the  quality  or  duration  of  another 
method. The presence of these interrelationships makes this an NP-hard scheduling problem;
further  complicating  factors  for  the  local  scheduler  include  the  fact  that  multiple  agents  are 
executing  related  methods,  that  some  methods  are  redundant  (executable  at  more  than  one 
agent),  and  that  the  subjective  task  structure  may  differ  from  the  real  objective  structure. 
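
To make the tree structure concrete, here is a small sketch of a task group whose leaves are executable methods and whose interior tasks combine the quality of their children. It is an assumption-laden illustration rather than the TÆMS implementation: the class names, the use of `min` and `max` as accrual functions, and the step-like quality of a method are all choices made for this example, and the arrows between tasks (the other interrelationships) are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Union


@dataclass
class Method:
    """Leaf of the tree: an executable method."""
    name: str
    quality: float                       # quality produced once it finishes
    finish_time: Optional[float] = None  # None means it has not been executed


@dataclass
class Task:
    """Interior node: combines the qualities of its children."""
    name: str
    accrue: Callable[[List[float]], float]  # e.g. min, max, sum (assumed)
    children: List[Union["Task", Method]] = field(default_factory=list)


def quality(node: Union[Task, Method], t: float) -> float:
    """Quality of a node at time t, counting only methods finished by t."""
    if isinstance(node, Method):
        done = node.finish_time is not None and node.finish_time <= t
        return node.quality if done else 0.0
    return node.accrue([quality(child, t) for child in node.children])


# A task group that needs both subtask T2 and method C (min), where T2 is
# satisfied by the better of two alternative methods A and B (max).
t2 = Task("T2", max, [Method("A", 30.0, 4.0), Method("B", 45.0, 6.0)])
group = Task("T1", min, [t2, Method("C", 20.0, 9.0)])
```

Evaluating `quality(group, t)` at the group's deadline gives the quality that counts toward the performance goal; work finishing after the deadline contributes nothing because it is not finished by `t`.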

2  Summary of the GPGP algorithm family approach

This  section  will  provide  a  quick  overview  of  the  GPGP  approach.  Figure  2  shows  a  simple 
two-agent  example  that  we  will  use.  Each  agent  has  as  part  of  its  architecture  a  belief  database, 
local  scheduler,  and  coordination  module.  The  local  scheduler  uses  the  information  in  the 
belief  database  to  schedule  method  execution  actions  for  the  agent  in  an  attempt  to  maximize  its 
performance.  We  add  to  this  a  coordination  module  that  is  in  charge  of  communication  actions, 
information gathering actions, and of making and breaking commitments to complete tasks in
the  task  structure.  The  coordination  module  consists  of  several  coordination  mechanisms,  each 
of  which  notices  certain  features  in  the  task  structures  in  the  belief  database,  and  responds 
by  taking  certain  communication  or  information  gathering  actions,  or  by  proposing  new 
commitments.  The  coordination  mechanisms  rest  in  a  shared  coordination  module  substrate 
that  keeps  track  of  local  commitments  and  commitments  received  from  other  agents,  and  that 
chooses  from  among  multiple  schedules  if  the  local  scheduler  returns  multiple  schedules. 
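
The division of labor described above can be sketched as a coordination module that runs a list of mechanisms over the belief database and records the commitments they propose. The shape of `Beliefs`, the `Mechanism` signature, and the toy redundancy mechanism are hypothetical and serve only to show the plug-in structure; the actual substrate also tracks non-local commitments in more detail and arbitrates among schedules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Beliefs = Dict[str, dict]                   # task name -> believed attributes
Mechanism = Callable[[Beliefs], List[str]]  # returns tasks to commit to


@dataclass
class CoordinationModule:
    mechanisms: List[Mechanism] = field(default_factory=list)
    local_commitments: List[str] = field(default_factory=list)
    nonlocal_commitments: List[str] = field(default_factory=list)

    def step(self, beliefs: Beliefs) -> List[str]:
        """Run each mechanism once and record any new commitments it proposes."""
        proposed: List[str] = []
        for mechanism in self.mechanisms:
            for task in mechanism(beliefs):
                if task not in self.local_commitments:
                    self.local_commitments.append(task)
                    proposed.append(task)
        return proposed


def redundancy_mechanism(beliefs: Beliefs) -> List[str]:
    """Toy mechanism: propose a commitment for every task marked redundant."""
    return [name for name, attrs in beliefs.items() if attrs.get("redundant")]
```

Because each mechanism is just a function of the beliefs, mechanisms can be added to or removed from the list without touching the local scheduler, which is the separation the approach relies on.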

^The word ‘objective’ refers to the fact that this is the true, real structure.

Figure 1: Agent A and B's subjective views (bottom) of a typical objective task group (top)


In  these  environments,  the  agents  attempt  to  maximize  the  system-wide  total  utility  (a 
quantity called ‘quality’, described later) by executing sequences of interrelated ‘methods’. The
agents  do  not  initially  have  a  complete  view  of  the  problem  solving  situation,  and  the  execution 
of  a  method  at  one  agent  can  either  positively  or  negatively  affect  the  execution  of  other  methods 
at  other  agents.  We  will  show  examples  of  the  effect  of  the  environment  on  the  performance 
of  a  GPGP  family  member,  and  show  an  environment  where  family  member  A  is  better  than 
B,  and  a  different  environment  where  B  is  better  than  A.  We  will  return  to  the  demonstration 
of  meta-level  information  being  more  useful  when  there  is  a  large  amount  of  variance  between 
episodes  in  an  environment. 

Here  is  a  short  example  intended  only  to  give  the  reader  a  feel  for  the  overall  approach. 
In  Figure  2,  both  agents  have  executed  an  initial  information  gathering  action,  and  have 
their  initial  views  of  the  task  structure  (everything  in  the  agents’  belief  database  except  for  the 
shaded  tasks  (Tasks  2,  5,  D  and  E),  and  the  relationships  touching  the  shaded  tasks).  One 
of  the  coordination  mechanisms  (Mech.  1,  update  non-local  views)  performs  an  information 
gathering  action  to  determine  which  tasks  may  be  related  to  tasks  at  other  agents  (“detect 
coordination  relationships”).  These  tasks  are  then  exchanged  between  the  agents,  resulting  in 
the  belief  databases  shown  in  the  figure  (including  the  shaded  tasks).  Other  mechanisms  react 
to  the  task  structure.  One  mechanism  (Mech.  5,  handle  soft  predecessors)  notices  that  Task  2 
at Agent Y facilitates Task 5 at Agent X. In order that Agent X might schedule to take advantage
of this, Agent Y's mechanism makes a local intermediate deadline commitment to complete
its  Task  2  by  time  7  with  minimum  quality  45  (you  and  I  may  infer  that  Y  intends  to  execute 
Method  B,  but  that  local  information  is  not  a  part  of  the  commitment).  A  commitment  is 
made in two stages: first it is made locally to see if it is possible as far as the agent's local
scheduler  is  concerned,  and  then  it  is  made  non-locally  and  communicated  to  the  other  agents 
that  are  involved.  Note  that  the  deadline  on  the  non-local  version  of  this  commitment  is  later 
(time  8)  to  take  into  account  the  communication  delay  (here,  1  time  unit).  Similarly,  Agent 
X has a mechanism (Mech. 3, handle simple redundancy) that notices that either Agent X or
Agent Y could do Task 4. Agent X does eventually commit to this task (the process is a bit more
complicated, as will be explained later) and communicates this commitment to Agent Y.

Figure 2: An Overview of Generalized Partial Global Planning
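
The deadline arithmetic in the example (a local deadline of time 7 that becomes time 8 once the one-unit communication delay is added) can be written down directly. The `DeadlineCommitment` record and the `to_nonlocal` helper are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DeadlineCommitment:
    agent: str          # agent making the commitment
    task: str           # task to be completed
    min_quality: float  # minimum quality promised
    deadline: int       # time by which the quality will be achieved


def to_nonlocal(local: DeadlineCommitment, comm_delay: int) -> DeadlineCommitment:
    """Version of a local commitment as seen by a remote agent.

    The result is usable remotely only after it has also been communicated,
    so the deadline is pushed back by the communication delay.
    """
    return replace(local, deadline=local.deadline + comm_delay)


local = DeadlineCommitment(agent="Y", task="Task 2", min_quality=45, deadline=7)
remote_view = to_nonlocal(local, comm_delay=1)
```

Only the task, quality, and deadline travel with the commitment; which method Agent Y intends to run stays local, as the example notes.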

In  both  cases  the  agents'  local  schedulers  use  the  information  about  the  task  structure  they 
have  in  their  belief  database,  and  the  local  and  non-local  commitments,  to  construct  schedules. 
The  local  scheduler  may  return  multiple  schedules  for  several  reasons  we  explain  later.  Each 
schedule  is  evaluated  along  the  dimensions  of  the  performance  criteria  (such  as  total  final  quality 
and  termination  time)  and  for  what  (if  any)  local  commitments  are  violated.  If  a  commitment 
is  violated,  the  local  scheduler  may  suggest  an  alternative  (for  instance,  relaxing  a  quality  or 
intermediate  deadline  constraint).  The  coordination  module  chooses  a  schedule  from  this  set, 
and  handles  the  retraction  of  any  violated  commitments. 
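
One way to picture the choice among candidate schedules is a simple ranking over the criteria just listed. The lexicographic rule below (fewest violated commitments, then highest total quality, then earliest termination) is a stand-in assumed for illustration; the report describes the substrate's arbitration only as decision-theoretic and does not specify this rule.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Schedule:
    name: str
    total_quality: float
    termination_time: float
    violated: Tuple[str, ...] = ()  # local commitments this schedule breaks


def choose_schedule(candidates: List[Schedule]) -> Tuple[Schedule, Tuple[str, ...]]:
    """Pick a schedule and report which commitments must be retracted."""
    if not candidates:
        raise ValueError("the local scheduler returned no schedules")
    best = min(
        candidates,
        key=lambda s: (len(s.violated), -s.total_quality, s.termination_time),
    )
    return best, best.violated
```

Under this rule a schedule with higher quality loses to one that keeps every commitment, which may or may not be the right trade-off in a given environment.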

2. 1  The  Agent  Architecture 

We  make  few  assumptions  about  the  architecture  of  the  agents.  The  agents  have  a  database  that 
holds  their  current  beliefs  about  the  structure  of  the  tasks  in  the  current  episode;  we  represent 
this information using TÆMS. The agents can do three types of actions: they can execute
methods  from  the  task  structure,  send  direct  messages  to  one  another,  and  do  “information 
gathering”.  Information  gathering  actions  model  how  new  task  structures  or  communications 
get into the agent's belief database. This could be a combination of external actions (checking
the agent's incoming message box) and internal planning. Method execution actions cause
quality  to  accrue  in  a  task  group  (as  indicated  by  the  task  structure).  Communication  actions 
are  used  to  send  the  results  of  method  executions  (which  in  turn  may  trigger  the  effects  of 
various task interrelationships) or meta-level information.

Formally, we write $B_A^t(x)$ to mean agent A subjectively believes x at time t (from
Shoham [19]). We will shorten this to $B(x)$ when the particular agent or time is not important.
An agent’s subjective beliefs about the current episode include the agent’s beliefs about task
groups, subtasks, executable methods, and interrelationships (e.g., $B(T_1 \in E)$, $B(T_a, M_b \in T_1)$,
$B(\mathrm{enables}(T_a, M_b))$).

The  GPGP  family  of  coordination  mechanisms  also  makes  a  stronger  assumption  about  the 
agent  architecture.  It  assumes  the  presence  of  a  local  scheduling  mechanism  (to  be  described  in 
the  next  section)  that  can  decide  what  method  execution  actions  should  take  place  and  when. 
The  local  scheduler  attempts  to  maximize  a  (possibly  changing)  utility  function.  The  current 
set of GPGP coordination mechanisms is for cooperative teams of agents: they assume that
agents  do  not  intentionally  lie  and  that  agents  believe  what  they  are  told.  However,  because 
agents  can  believe  and  communicate  only  subjective  information,  they  may  unwittingly  transmit 
information  that  is  inconsistent  with  an  objective  view  (this  can  cause,  among  other  things, 
the phenomenon of distraction). Finally, the GPGP family approach requires domain-dependent
code  to  detect  or  predict  the  presence  of  coordination  relationships  in  the  local  task  structure. 
In  this  paper  we  will  refer  to  that  domain-dependent  code  as  the  information  gathering  action 
called  detect-coordination-relationships-,  we  will  describe  this  action  more  in  Section  3.2. 


2.2  The  Local  Scheduler 

Each  GPGP  agent  contains  a  local  scheduler  that  takes  three  types  of  input  information 
and  produces  a  set  of  schedules  and  alternatives.  The  first  input  is  the  current,  subjectively 
believed  task  structure.  Using  information  about  the  potential  duration,  potential  quality,  and 
interrelationships,  the  local  scheduler  chooses  and  orders  executable  methods  in  an  attempt  to 
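maximize a pre-defined utility function. In this paper the utility function is the sum of the
task group qualities, $U(\mathbf{E}) = \sum_{\mathcal{T} \in \mathbf{E}} Q(\mathcal{T}, D(\mathcal{T}))$, where $Q(T, t)$ denotes
the quality of T at time t as defined in [8] and $D(\mathcal{T})$ is the deadline of task group
$\mathcal{T}$. Quality does not accrue after a task group's deadline.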

The  second  input  is  a  set  of  commitments  C.  These  commitments  are  produced  by  the 
GPGP  coordination  mechanisms,  and  act  as  extra  constraints  on  the  schedules  that  are  produced 
by  the  local  scheduler.  For  example,  if  method  1  is  executable  by  agent  A  and  method  2  is 
executable  by  agent  B,  and  the  methods  are  redundant,  then  one  of  agent  A’s  coordination 
mechanisms may commit agent A to do method 1. Commitments are social: they are directed to
particular agents in the sense of the work of Shoham and Castelfranchi [1, 19]. A local
commitment C by agent A becomes a non-local commitment when received by another agent
B. This paper will use two types of commitments: $C(\text{Do}(T, q))$ is a commitment to ‘do’
(achieve quality for) T and is satisfied at the time t when $Q(T, t) \ge q$; the second type,
$C(\text{DL}(T, q, t_{dl}))$, is a ‘deadline’ commitment to do T by time $t_{dl}$ and is satisfied at the time t
when $[Q(T, t) \ge q] \wedge [t \le t_{dl}]$. When a commitment is sent to another agent, it also implies
that  the  task  result  will  be  communicated  to  the  other  agent  (by  the  deadline,  if  it  is  a  deadline 
commitment). 
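
As a rough illustration of the two commitment types, the sketch below encodes Do and DL
commitments as small Python records and checks when each is satisfied. The names `Do`, `DL`,
and `satisfied_at` are hypothetical and not part of GPGP. The quality function Q(T, t) is
simply passed in by the caller, and the comparisons are assumed to be non-strict.

```python
from dataclasses import dataclass
from typing import Callable, Union

# Q(T, t): quality accrued by task T at time t. The real definition is in [8];
# here it is an arbitrary callable supplied by the caller (an assumption).
QualityFn = Callable[[str, float], float]


@dataclass(frozen=True)
class Do:
    """C(Do(T, q)): achieve at least quality q for task T."""
    task: str
    min_quality: float


@dataclass(frozen=True)
class DL:
    """C(DL(T, q, t_dl)): achieve at least quality q for task T by time t_dl."""
    task: str
    min_quality: float
    deadline: float


def satisfied_at(c: Union[Do, DL], quality: QualityFn, t: float) -> bool:
    """Return True if commitment c is satisfied at time t."""
    if quality(c.task, t) < c.min_quality:
        return False
    if isinstance(c, DL):
        return t <= c.deadline  # a deadline commitment must also be on time
    return True
```

With a step-shaped quality function that reaches 50 at time 4, `DL("T4", 50.0, 5.0)` is
satisfied at t = 5 but not at t = 6, while `Do("T4", 50.0)` is satisfied at any t of 4 or more.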

The  third  input  to  the  local  scheduler  is  the  set  of  non-local  commitments  NLC  made 
by  other  agents.  This  information  can  be  used  by  the  local  scheduler  to  coordinate  actions 
between agents. For example, the local scheduler could have the property that, if method $M_1$
is executable by agent A and is the only method that enables method $M_2$ at agent B (and
agent B knows this), and $B_A(C(\text{DL}(M_1, q, t_1))) \in B_B(\text{NLC})$, then for every schedule S
produced by agent B, $(M_2, t) \in S \Rightarrow t \ge t_1$ (in other words, agent B only schedules the
enabled method after the deadline that agent A has committed to).
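
The property in this example can be checked mechanically. The sketch below is a minimal
illustration, assuming a schedule is a list of (method, start time) pairs; the function name
`respects_deadline_commitment` is hypothetical.

```python
from typing import List, Tuple

Schedule = List[Tuple[str, float]]  # [(method name, start time), ...]


def respects_deadline_commitment(schedule: Schedule,
                                 enabled_method: str,
                                 committed_deadline: float) -> bool:
    """True if every occurrence of the enabled method starts no earlier than
    the deadline the other agent committed to for the enabling method."""
    return all(start >= committed_deadline
               for method, start in schedule
               if method == enabled_method)
```

A schedule that does not contain the enabled method passes the check trivially.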

A schedule S produced by a local scheduler will consist of a set of methods and start
times: $S = \{(M_1, t_1), (M_2, t_2), \ldots, (M_n, t_n)\}$. The schedule may include idle time, and the
local scheduler may produce more than one schedule upon each invocation in the situation
where not all commitments can be met. The different schedules represent different ways
of partially satisfying the set of commitments. The function Violated(S) returns the set
of commitments that are believed to be violated by the schedule. For violated deadline
commitments $C(\text{DL}(T, q, t_{dl})) \in \text{Violated}(S)$, the function Alt(C, S) returns an alternative
commitment $C(\text{DL}(T, q, t^*_{dl}))$, where $t^*_{dl} = \min t$ such that $Q(T, t) \ge q$ if such a t exists, or
NIL otherwise. For a violated Do commitment an alternative may contain a lower minimum
quality, or no alternative may be possible. The function $U_{est}(\mathbf{E}, S, \text{NLC})$ returns the estimated
utility at the end of the episode if the agent follows schedule S and all non-local commitments
in NLC are kept.
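
To make Violated and Alt concrete, here is a minimal sketch under strong simplifying
assumptions that are not in the paper: every commitment refers to a single method, and a
method accrues its full quality at the moment it finishes. The names `Method`,
`DeadlineCommitment`, `violated`, and `alt` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Method:
    duration: float
    quality: float  # assumed to accrue all at once when the method finishes


@dataclass(frozen=True)
class DeadlineCommitment:
    method: str
    min_quality: float
    deadline: float


Schedule = List[Tuple[str, float]]  # [(method name, start time), ...]


def finish_time(schedule: Schedule, methods: Dict[str, Method],
                name: str) -> Optional[float]:
    """Finish time of `name` in the schedule, or None if it is not scheduled."""
    for method, start in schedule:
        if method == name:
            return start + methods[method].duration
    return None


def violated(schedule: Schedule, methods: Dict[str, Method],
             commitments: List[DeadlineCommitment]) -> List[DeadlineCommitment]:
    """Deadline commitments the schedule is believed to violate."""
    result = []
    for c in commitments:
        done = finish_time(schedule, methods, c.method)
        met = (done is not None and done <= c.deadline
               and methods[c.method].quality >= c.min_quality)
        if not met:
            result.append(c)
    return result


def alt(c: DeadlineCommitment, schedule: Schedule,
        methods: Dict[str, Method]) -> Optional[DeadlineCommitment]:
    """Same commitment with the earliest achievable deadline, or None (NIL)."""
    done = finish_time(schedule, methods, c.method)
    if done is None or methods[c.method].quality < c.min_quality:
        return None
    return DeadlineCommitment(c.method, c.min_quality, done)
```

In this simplified model the earliest time at which the required quality is reached is just the
method's finish time, so `alt` moves the deadline there. A real task structure would need the
full quality accumulation rules from [8].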

Thus we may define the local scheduler as a function LS(E, C, NLC) returning a set
of schedules $\mathbf{S} = \{S_1, S_2, \ldots, S_m\}$. More detailed information about this kind of interface
between the local scheduler and the coordination component may be found in [12]. This
is an extremely general definition of the local scheduler, and is the minimal one necessary
for the GPGP coordination module. Stronger definitions than this will be needed for more
predictable performance, as we will discuss later. Ideally, the optimal local scheduler would
find both the schedule with maximum utility $\hat{S}_U$ and the schedule with maximum utility
that violates no commitments $\hat{S}_C$. In practice, however, a heuristic local scheduler will
produce a set of schedules where the schedule of highest utility $S_U$ is not necessarily optimal:
$U(\mathbf{E}, S_U, \text{NLC}) \le U(\mathbf{E}, \hat{S}_U, \text{NLC})$.

3  Five  GPGP  Coordination  Mechanisms 

The  role  of  the  coordination  mechanisms  is  to  provide  information  to  the  local  scheduler  that 
allows  the  local  scheduler  to  construct  better  schedules.  This  information  can  be  in  the  form 
of  modifications  to  portions  of  the  subjective  task  structure  of  the  episode  or  in  the  form  of 
local  and  non-local  commitments  to  tasks  in  the  task  structure.  The  five  mechanisms  we  will 
describe  in  this  paper  form  a  basic  set  that  provides  similar  functionality  to  the  original  Partial 
Global  Planning  algorithm  as  shown  in  [5].  Mechanism  1  exchanges  useful  private  views  of 
task  structures;  Mechanism  2  communicates  results;  Mechanism  3  handles  redundant  methods; 
Mechanisms  4  and  5  handle  hard  and  soft  coordination  relationships.  More  mechanisms  can 
be  added,  such  as  one  to  update  utilities  across  agents  as  discussed  in  the  next  section,  or  to 
balance  the  load  better  between  agents.  The  mechanisms  are  independent  in  the  sense  that  they 
can  be  used  in  any  combination.  If  inconsistent  constraints  are  introduced,  the  local  scheduler 
will  return  at  least  one  violated  constraint  in  all  its  schedules.  Since  the  local  scheduler  typically 
satisfices  instead  of  optimizes,  it  may  do  this  even  if  constraints  are  not  inconsistent  (i.e.  it 
does  not  search  exhaustively).  The  next  section  describes  how  a  schedule  is  chosen  by  the 
coordination module substrate.

3.1 The GPGP Coordination Module Substrate

All  the  specific  coordination  mechanisms  rest  on  a  common  substrate  that  handles  information 
gathering  actions,  invoking  the  local  scheduler,  choosing  a  schedule  to  execute  (including 
dealing with violated or inconsistent commitments), and deciding when to terminate processing
on  a  task  group.  Information  gathering  actions  include  noticing  new  task  group  arrivals  and 
receiving  communications  from  other  agents.  Information  gathering  is  done  at  the  start  of 
problem  solving,  when  communications  are  expected  from  other  agents,  and  when  the  agent 
is  otherwise  idle.  Communications  are  expected  in  response  to  certain  events  (such  as  after 
the  arrival  of  a  new  task  group)  or  as  indicated  in  the  set  of  non-local  commitments  NLC. 
This  is  the  minimal  general  information  gathering  policy.  Termination  of  processing  on  a  task 
group  occurs  for  an  agent  when  the  agent  is  idle,  has  no  expected  communications,  and  no 
outstanding  commitments  for  the  task  group. 

Choosing a schedule is more complicated. The agent's local scheduler may return multiple
schedules because it cannot find a single schedule that both maximizes utility and meets all
commitments. From the set of schedules S returned by the local scheduler, two particular
schedules are identified: the schedule with the highest utility $S_U$ and the best committed
schedule $S_C$. If they are the same, then that schedule is chosen. Otherwise, we examine
the sum of the changes in utility for each commitment. Each commitment, when created, is
assigned the estimated utility $U_{est}$ for the task group of which it is a part. This utility may be
updated over time (when other agents depend on the commitment, for example). We then
choose the schedule with the largest positive change in utility. This allows us to abandon
commitments if doing so will result in higher overall utility. The coordination substrate does
not use the local scheduler's utility estimate $U_{est}$ directly on the entire schedule because it is
based only on a local view. The coordination substrate may receive non-local information that
places a higher utility on a commitment than it has locally.

For example, at time t agent A may make a commitment $C_1$ on task $T \in \mathcal{T}_1 \in \mathbf{E}$ that
results in a schedule $S_1$. $C_1$ initially acquires the estimated utility of the task group of which
it is a part, $U(C_1) \leftarrow U_{est}(\{\mathcal{T}_1\}, S_1, B_A(\text{NLC}))$. Let $U(C_1) = 50$. After communicating
this commitment to agent B (making it part of $B_B(\text{NLC})$), agent B uses the commitment
to improve $U_{est}(\{\mathcal{T}_1\}, S_2, B_B(\text{NLC}))$ to 100. A coordination mechanism can detect this
discrepancy and communicate the utility increase back to agent A, so that when agent A
considers discarding the commitment, the coordination substrate recognizes the non-local
utility of the commitment is greater than the local utility.

If  both  schedules  have  the  same  utility,  the  one  that  is  more  negotiable  is  chosen.  Every 
commitment  has  a  negotiability  index  (high,  medium,  or  low)  that  indicates  (heuristically)  the 
difficulty  in  rescheduling  if  the  commitment  is  broken.  This  index  is  set  by  the  individual 
coordination  mechanisms.  For  example,  hard  coordination  relationships  like  enables  that 
cannot  be  ignored  will  trigger  commitments  with  low  negotiability.  If  the  schedules  are  still 
equivalent,  the  shorter  one  is  chosen,  and  if  they  are  the  same  length,  one  is  chosen  at  random. 
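
The selection and tie-breaking order described above can be sketched as a short decision
chain. This is only one possible reading: how the per-schedule utility change and the
schedule-level negotiability are aggregated from individual commitments is not spelled out
here, so the sketch assumes both have already been computed. The `Candidate` record and the
`choose` function are hypothetical names.

```python
import random
from dataclasses import dataclass

# Higher rank means "more negotiable" (an assumed ordering of the three levels).
NEGOTIABILITY_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Candidate:
    name: str
    utility_change: float  # summed change in commitment utilities (precomputed)
    negotiability: str     # "low", "medium", or "high" (schedule-level summary)
    length: float          # total schedule length


def choose(s_u: Candidate, s_c: Candidate, rng: random.Random) -> Candidate:
    """Pick between the highest-utility and the best committed schedule."""
    if s_u == s_c:
        return s_u
    if s_u.utility_change != s_c.utility_change:
        return max((s_u, s_c), key=lambda s: s.utility_change)
    if s_u.negotiability != s_c.negotiability:
        return max((s_u, s_c), key=lambda s: NEGOTIABILITY_RANK[s.negotiability])
    if s_u.length != s_c.length:
        return min((s_u, s_c), key=lambda s: s.length)
    return rng.choice((s_u, s_c))  # last resort: random choice
```

Passing the random generator in explicitly keeps the final tie-break reproducible in
experiments.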

After a schedule S is chosen, if Violated(S) is not empty, then each commitment $C \in
\text{Violated}(S)$ is replaced with its alternative: $\mathbf{C} \leftarrow (\mathbf{C} \setminus C) \cup \text{Alt}(C, S)$. If the commitment was
made to other agents, the other agents are also informed of the change in the commitment.
While this could potentially cause cascading changes in the schedules of multiple agents, it
generally does not for three reasons: first, as we mentioned in the previous paragraph, less
important commitments are broken first; second, the resilience of the local schedulers in
solving problems in multiple ways tends to damp out these fluctuations; and third, agents are
time-cognizant, resource-bounded reasoners that interleave execution and scheduling (i.e., the
agents cannot spend all day arguing over scheduling details and still meet their deadlines). We
have observed this useful phenomenon before [4] and plan to analyze it in future work.

3.2  Mechanism  1:  Updating  Non-Local  Viewpoints 

Remember  that  each  agent  has  only  a  partial,  subjective  view  of  the  current  episode.  The 
GPGP mechanism described here can communicate no private information (‘none’ policy, no
non-local view), or all of it (‘all’ policy, global view), or take an intermediate approach (‘some’
policy, partial view). The process of detecting coordination relationships between private and
shared  parts  of  a  task  structure  is  in  general  very  domain  specific,  so  we  model  this  process 
by  a  new  information  gathering  action,  detect-coordination-relationships,  that  takes  some  fixed 
amount  of  the  agent’s  time.  This  action  is  scheduled  whenever  a  new  task  group  arrives. 

The set P of privately believed tasks or methods at an agent A (tasks believed at arrival time
by A only) is then $P = \{x \mid \text{task}(x) \wedge \forall a \in \mathbf{A} \setminus \{A\}, \neg B_A(B_a^{Ar(x)}(x))\}$, where $\mathbf{A}$ is the set of all
agents and $Ar(x)$ is the arrival time of x. Given this definition, the action detect-coordination-
relationships returns the set of private coordination relationships $\text{PCR} = \{r \mid T_1 \in P \wedge T_2 \notin
P \wedge [r(T_1, T_2) \vee r(T_2, T_1)]\}$ between private and mutually believed tasks. The action does
not return what the task $T_2$ is, just that a relationship exists between $T_1$ and some otherwise
unknown task $T_2$. For example, in the DVMT, we have used the physical organization of agents
to detect that Agent A's task $T_1$ in an overlapping sensor area is in fact related to some unknown
task $T_2$ at agent B (i.e., $B_A(B_B(T_2))$) [5]. The non-local view coordination mechanism
then communicates these coordination relationships, the private tasks, and their context: if
$r(T_1, T_2) \in \text{PCR}$ and $T_1 \in P$ then r and $T_1$ will be communicated by agent A to the set of
agents $\{a \mid B_A(B_a(T_2))\}$.
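
The two set definitions above translate directly into set comprehensions. The sketch below is
illustrative only: it assumes an agent's beliefs about who knew each task at arrival time are
available as a plain mapping, and it represents a relationship as a (kind, source, target)
triple. Unlike the action described in the text, it keeps the identity of the non-private
task, which a real detect-coordination-relationships action would not reveal.

```python
from typing import Dict, List, Set, Tuple

Relationship = Tuple[str, str, str]  # (kind, source task, target task)


def private_tasks(believed_holders: Dict[str, Set[str]], me: str) -> Set[str]:
    """Tasks that agent `me` believes no other agent knew about on arrival.

    believed_holders maps each task to the set of agents that `me` believes
    held the task at its arrival time (a hypothetical representation).
    """
    return {task for task, holders in believed_holders.items()
            if holders <= {me}}


def private_coordination_relationships(
        relationships: List[Relationship],
        private: Set[str]) -> List[Relationship]:
    """Relationships with exactly one private endpoint (the set PCR)."""
    return [r for r in relationships
            if (r[1] in private) != (r[2] in private)]
```

For an agent B that privately holds T3 and T4, with T1 mutually believed, a
facilitates(T4, T1) relationship lands in PCR while enables(T4, T3) stays purely local.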


Figure 3: Agents A and B's local views after receiving non-local viewpoint communications via mechanism 1
(shaded objects). Figure 1 shows the agents' initial states.

For example, Figure 3 shows the local subjective beliefs of agents A and B after they have
communicated with one another through this mechanism.
The agents' initial local views were shown previously in Figure 1. In this example, $T_3$ and $T_4$
are two elements in Agent B's private set of tasks P, $\text{facilitates}(T_4, T_1, \phi_d, \phi_q) \in \text{PCR}$ (the
facilitation relates a private task to a mutually believed task), and enables($T_4$, $T_3$) is completely
local to Agent B (it relates two private tasks). At the start of this section we mentioned that
coordination  relationships  exist  between  portions  of  the  task  structure  controllable  by  different 
agents  (i.e.,  in  PCR)  and  within  portions  controllable  by  multiple  agents.  We’ll  denote  the 
complete  set  of  coordination  relationships  as  CR;  this  includes  all  the  elements  of  PCR  and 
all  the  relationships  between  non-private  tasks.  Some  relationships  are  entirely  local — between 
private  tasks — and  are  only  of  concern  to  the  local  scheduler.  The  purpose  of  this  coordination 
mechanism  is  the  exchange  of  information  that  expands  the  set  of  coordination  relationships 
CR.  Without  this  mechanism  in  place,  CR  will  consist  of  only  non-private  relationships, 
and  none  that  are  in  PCR.  Since  the  primary  focus  of  the  coordination  mechanisms  is  the 
creation  of  social  commitments  in  response  to  coordination  relationships  (elements  of  CR), 
this  mechanism  can  have  significant  indirect  benefits.  In  environments  where  |PCR|  tends 
to  be  small,  very  expensive  to  compute,  or  not  useful  for  making  commitments  (see  the  later 
sections), this mechanism can be successfully omitted.

3.3  Mechanism  2:  Communicating  Results 

The result communication coordination mechanism has three possible policies: communicate
only the results necessary to satisfy commitments to other agents (the minimal policy);
communicate this information plus the final results associated with a task group (‘TG’ policy);
and communicate all results (‘all’ policy^3). Extra result communications are broadcast to all
agents; the minimal commitment-satisfying communications are sent only to those agents to
whom the commitment was made (i.e., communicate the result of T to the set of agents
$\{A \in \mathbf{A} \mid B(B_A(C(T)))\}$).
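
The three policies differ only in who receives a given result. A minimal sketch, assuming the
agent tracks which other agents it has made commitments to for each task; the function name
`recipients` and the policy labels "min", "TG", and "all" as string constants are illustrative
choices.

```python
from typing import Dict, Set


def recipients(policy: str,
               task: str,
               is_final_task_group_result: bool,
               committed_to: Dict[str, Set[str]],
               other_agents: Set[str]) -> Set[str]:
    """Agents that should receive the result of `task` under a policy.

    committed_to maps a task to the agents a commitment on it was made to.
    """
    if policy not in ("min", "TG", "all"):
        raise ValueError(f"unknown policy: {policy}")
    # Commitment-satisfying results always go to the agents committed to.
    targets = set(committed_to.get(task, set()))
    # Extra results are broadcast to every other agent.
    if policy == "all" or (policy == "TG" and is_final_task_group_result):
        targets |= other_agents
    return targets
```

Under the minimal policy a result with no associated commitment is not sent at all.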

3.4  Mechanism  3:  Handling  Simple  Redundancy 

Potential  redundancy  in  the  efforts  of  multiple  agents  can  occur  in  several  places  in  a  task 
structure.  Any  task  that  uses  a  ‘max’  quality  accumulation  function  (one  possible  semantics 
for  an  ‘OR’  node)  indicates  that,  in  the  absence  of  other  relationships,  only  one  subtask  needs 
to  be  done.  When  such  subtasks  are  complex  and  involve  many  agents,  the  coordination  of 
these  agents  to  avoid  redundant  processing  can  also  be  complex;  we  will  not  address  the  general 
redundancy  avoidance  problem  in  this  paper  (see  instead  [16]).  In  the  original  PGP  algorithm 
and  domain  (distributed  sensor  interpretation),  the  primary  form  of  potential  redundancy 
was  simple  method  redundancy — the  same  result  could  be  derived  from  the  data  from  any  of 
a  number  of  sensors.  The  coordination  mechanism  described  here  is  meant  to  address  this 
simpler  form  of  potential  redundancy. 

The idea behind the simple redundancy coordination mechanism is that when more than
one agent wants to execute a redundant method, one agent is randomly chosen to execute
it and send the results to the other interested agents. This is a generalization of the ‘static’
organization algorithm discussed by Decker and Lesser [6]; it does not try to load balance, and
uses one communication action (because in the general case the agents do not know beforehand,
without communication, that certain methods are redundant^4). The mechanism considers the
set of potential redundancies $\text{RCR} = \{r \in \text{CR} \mid [r = \text{subtask}(T, \mathbf{M}, \min)] \wedge [\forall M \in
\mathbf{M}, \text{method}(M)]\}$. Then for all methods in the current schedule S at time t, if the method is
potentially redundant then commit to it and send the commitment to Others(M) (non-local
agents who also have a method in $\mathbf{M}$):

$$[(M, t_M) \in S] \wedge [\text{subtask}(T, \mathbf{M}, \min) \in \text{RCR}] \wedge [M \in \mathbf{M}] \Rightarrow$$
$$[C(\text{Do}(M, Q_{est}(M, D(M), S))) \in \mathbf{C}] \wedge [\text{comm}(C, \text{Others}(M), t) \in \mathcal{I}]$$

See, for example, the top of Figure 4: both agents commit to Do their methods for $T_1$.

^3 Such a policy is all that is needed in many simple environments.

After the commitment is made, the agent must refrain from executing the method in
question if possible until any non-local commitments that were made simultaneously can arrive
(the communication delay time $\delta$). This mechanism then watches for multiple commitments
in the redundant set and if they appear, a unique agent is chosen randomly (but identically
by all agents) from those with the best commitments to keep its commitment. All the other
agents can retract their commitments. For example, the bottom of Figure 4 shows the situation
after Agent B has retracted its commitment to Do B1. If all agents follow the same algorithm,
and communication channels are assumed to be reliable, then no second message (retraction)
actually needs to be sent (because they all choose the same agent to do the redundant method).
In  the  implementation  described  later,  identical  random  choices  are  made  by  giving  each 
method  a  unique  random  identifier,  and  then  all  agents  choose  the  method  with  the  ‘smallest’ 
identifier  for  execution. 
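
The "random but identical" choice can be sketched as a deterministic rule that every agent
evaluates on the same set of commitments. The tuple layout and the function name
`agent_that_keeps` are assumptions, and treating "best commitment" as "highest committed
quality" is one plausible reading rather than something the text specifies.

```python
from typing import List, Tuple

# (agent, random identifier of its redundant method, committed quality)
Commitment = Tuple[str, int, float]


def agent_that_keeps(commitments: List[Commitment]) -> str:
    """Agent that keeps its commitment; all others may retract.

    Among the commitments with the highest quality, the method with the
    smallest pre-assigned random identifier wins. The result does not depend
    on the order in which commitments arrived, so every agent agrees.
    """
    best_quality = max(quality for _, _, quality in commitments)
    best = [c for c in commitments if c[2] == best_quality]
    return min(best, key=lambda c: c[1])[0]
```

Because the identifiers are assigned once per method and are unique, no retraction message is
needed when channels are reliable: each agent computes the same winner locally.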

Initially,  all  Do  commitments  initiated  by  the  redundant  coordination  mechanism  are 
marked  highly  negotiable.  When  a  redundant  commitment  is  discovered,  the  negotiability  of 
the  remaining  commitment  is  lowered  to  medium  to  indicate  the  commitment  is  somewhat 
more  important. 

3.5  Mechanism  4:  Handling  Hard  Coordination  Relationships 

Hard coordination relationships include relationships like enables($M_1$, $M_2$) that indicate that
$M_1$ must be executed before $M_2$ in order to obtain quality for $M_2$. Like redundant methods,
hard coordination relationships can be culled from the set CR. The hard coordination
mechanism further distinguishes the direction of the relationship: the current implementation
only creates commitments on the predecessors of the enables relationship. We'll let $\text{HPCR} \subset
\text{CR}$ indicate the set of potential hard predecessor coordination relationships. The hard
coordination mechanism then looks for situations where the current schedule S at time t will
produce quality for a predecessor in HPCR, and commits to its execution by a certain deadline
both locally and socially:

$$[Q_{est}(T, D(T), S) > 0] \wedge [\text{enables}(T, M) \in \text{HPCR}] \Rightarrow$$
$$[C(\text{DL}(T, Q_{est}(T, D(T), S), t_{early})) \in \mathbf{C}] \wedge [\text{comm}(C, \text{Others}(M), t) \in \mathcal{I}]$$

^4 The detection of redundant methods is domain-dependent, as discussed earlier. Since we are talking here
about simple, direct redundancy (i.e., doing the exact same method at more than one agent) this detection is very
straightforward.

[Figure 4 image not reproduced. Panels: Agent A's view after communication from B; Agent B's view after
communication from A; Agent A's view after receiving B's commitments; Agent B's view after receiving A's
commitments. Commitments made from A to B: Do(A1, 100) [Mech #3]. Commitments made from B to A:
DL(T4, 50, 5) [Mech #5], Do(B1, 100) [Mech #3].]

Figure 4: A continuation of Figures 1 and 2. At top: agents A and B propose certain commitments to one another
via mechanisms 3 and 5. At bottom: after receiving the initial commitments, mechanism 3 removes agent B's
redundant commitment.

The next question is, by what time ($t_{early}$ above) do we commit to providing the answer?
One solution, usable with any local scheduler that fits our general description in Section 2.2,
is to use the minimum t such that $Q_{est}(T, t, S) > 0$. In our implementation, the local
scheduler provides a query facility that allows us to propose a commitment to satisfy ‘as early
as possible’ (thus allowing the agent on the other end of the relationship more slack). We take
advantage of this ability in the hard coordination mechanism by adding the new commitment
$C(\text{DL}(T, Q_{est}(T, D(T), S), \text{“early”}))$ to the local commitment set $\mathbf{C}$, and invoking the local
scheduler LS(E, C, NLC) to produce a new set of schedules $\mathbf{S}$. If the preferred, highest
utility schedule $S_U \in \mathbf{S}$ has no violations (highly likely since the local scheduler can simply
return the same schedule if no better one can be found), we replace the current schedule with
it and use the new schedule, with a potentially earlier finish time for T, to provide a value for
$t_{early}$. The new completed commitment is entered locally (with low negotiability) and sent to
the subset of interested other agents.
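
The commitment rule for hard predecessors can be sketched as a single pass over HPCR. This
is an illustrative simplification: the estimated quality and the earliest finish time of each
predecessor under the current schedule are assumed to be given as dictionaries, standing in
for queries to the local scheduler, and the names `DeadlineCommitment` and `hard_commitments`
are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class DeadlineCommitment(NamedTuple):
    task: str
    min_quality: float
    deadline: float       # plays the role of t_early
    negotiability: str


def hard_commitments(hpcr: List[Tuple[str, str]],
                     est_quality: Dict[str, float],
                     est_finish: Dict[str, float]) -> List[DeadlineCommitment]:
    """One low-negotiability deadline commitment per productive predecessor.

    hpcr holds (predecessor, successor) pairs for enables relationships.
    A predecessor is committed to only if the current schedule is expected
    to produce some quality for it.
    """
    made: Dict[str, DeadlineCommitment] = {}
    for predecessor, _successor in hpcr:
        quality = est_quality.get(predecessor, 0.0)
        if quality > 0 and predecessor not in made:
            made[predecessor] = DeadlineCommitment(
                predecessor, quality, est_finish[predecessor], "low")
    return list(made.values())
```

Each resulting commitment would then be sent to the agents holding the enabled successors.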

If  redundant  commitments  are  made  to  the  same  task,  the  earliest  commitment  made  by 
any  agent  is  kept,  then  the  agent  committing  to  the  highest  quality,  and  any  remaining  ties  are 
broken  by  the  same  method  as  before. 

Currently, the hard coordination mechanism is a pro-active mechanism, providing
information that might be used by other agents, while not putting the individual agent to
any extra effort. Other future coordination mechanisms might be added to the family that are
reactive and request from other agents that certain tasks be done by certain times; this is quite
different behavior that would need to be analyzed separately.

3.6  Mechanism  5:  Handling  Soft  Coordination  Relationships 

Soft coordination relationships are handled analogously to hard coordination relationships
except that they start out with high negotiability. In the current implementation the predecessor
of a facilitates relationship is the only one that triggers commitments across agents, although
hinders relationships are present. The positive relationship facilitates($M_1$, $M_2$, $\phi_d$, $\phi_q$)
indicates that executing $M_1$ before $M_2$ decreases the duration of $M_2$ by a ‘power’ factor related to
$\phi_d$ and increases the maximum quality possible by a ‘power’ factor related to $\phi_q$ (see [8] for
the details). A more situation-specific version of this coordination mechanism might ignore
relationships with very low ‘power’. The relationship hinders($M_1$, $M_2$, $\phi_d$, $\phi_q$) is negative and
indicates an increase in the duration of $M_2$ and a decrease in maximum possible quality. A
coordination mechanism could be designed for hinders (and similar negative relationships)
and added to the family. To be pro-active like the existing mechanisms, a hinders mechanism
would work from the successors of the relationship, try to schedule them late, and commit to
an earliest start time on the successor. Figure 4 shows Agent B making a DL commitment to do
method B4, which in turn allows Agent A to take advantage of the facilitates($T_4$, $T_1$, 0.5, 0.5)
relationship, causing method A1 to take only half the time and produce 1.5 times the quality.
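
The arithmetic in the Figure 4 example can be reproduced with a simple linear model. This is
an assumption made for illustration: the exact definition of the ‘power’ factors is in [8], and
the sketch only covers the case where the facilitating task has delivered its full effect. The
function name `facilitated` and the `power` argument are hypothetical.

```python
from typing import Tuple


def facilitated(duration: float, max_quality: float,
                phi_d: float, phi_q: float,
                power: float = 1.0) -> Tuple[float, float]:
    """Duration and maximum quality of a facilitated method.

    Assumes a linear effect: duration shrinks by phi_d * power and maximum
    quality grows by phi_q * power, where power is in [0, 1].
    """
    new_duration = duration * (1.0 - phi_d * power)
    new_quality = max_quality * (1.0 + phi_q * power)
    return new_duration, new_quality
```

With both strengths at 0.5 and full power, a method of duration 10 and maximum quality 100
becomes a method of duration 5 and maximum quality 150, matching "half the time and 1.5 times
the quality".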

4  Experimental  Results 

We  do  not  believe  that  any  of  the  mechanisms  that  collectively  form  the  GPGP  family  of 
coordination algorithms are indispensable. What we can do is evaluate the mechanisms in
terms of their costs and benefits to cooperative problem solving both analytically and
experimentally.  This  analysis  and  experimentation  takes  place  with  respect  to  a  very  general 
task  environment  that  does  not  correspond  to  a  particular  domain.  Doing  this  produces 
general  results,  but  weaker  than  would  be  possible  to  derive  in  a  single  fixed  domain  because 
the  performance  variance  between  problem  episodes  will  be  far  greater  than  the  performance 
variance  of  the  different  algorithms  within  a  single  episode.  Still,  this  allows  us  to  determine 
broad  characteristics  of  the  algorithm  family  that  can  be  used  to  reduce  the  search  for  a 
particular  set  of  mechanism  parameters  for  a  particular  domain  (with  or  without  machine 
learning  techniques;  see  Section  5).  We  will  also  discuss  statistical  techniques  (e.g.  paired- 
response  simulations)  to  deal  with  the  large  between-episode  variances  that  occur  when  using 
randomly-generated  problems. 
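
The benefit of a paired-response design can be seen in a few lines of code: when two
algorithms are run on the same randomly generated episodes, the per-episode differences vary
far less than the raw scores do. The sketch below uses the Python standard library only; the
function name `paired_summary` and the sample numbers in the usage note are made up for
illustration and are not experimental results.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_summary(scores_a: Sequence[float],
                   scores_b: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation of per-episode differences.

    scores_a[i] and scores_b[i] must come from the same episode i.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs), stdev(diffs)
```

For hypothetical final qualities `[110, 210, 55, 320]` and `[100, 200, 50, 300]` on four shared
episodes, the differences are 10, 10, 5, and 20. Their spread is much smaller than the spread of
either raw score list, which is what makes a consistent difference between algorithms visible.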

4.1 GPGP Simulation: Issues

Our model of an abstract task environment, used in these experiments, has ten parameters;
Table 1 lists them and the values used in the experiments described in the next two sections.^5
Figure 2 shows a small example task group.

^5 Our earlier work focussed on the analysis of distributed sensor network task environments [6, 7].


Parameter                          Values (facilitation exps.)               Values (clustering exps.)
Mean Branching factor (Poisson)    1                                         1
Mean Depth (Poisson)               3                                         3
Mean Duration (exponential)        10                                        (1 10 100)
Redundant Method QAF               Max                                       Max
Number of task groups              2                                         (1 5 10)
Task QAF distribution              (20%/80% min/max)                         (50%/50% min/max)
                                                                             (100%/0% min/max)
Hard CR distribution               (10%/90% enables/none)                    (0%/100% enables/none)
                                                                             (50%/50% enables/none)
Soft CR distribution               (80%/10%/10% facilitates/hinders/none)    (0%/10%/90% facilitates/hinders/none)
                                                                             (50%/10%/40% facilitates/hinders/none)
Chance of overlaps (binomial)      10%                                       (0% 50% 100%)
Facilitation Strength              .1 .5 .9                                  .5

Table 1: Environmental Parameters used to generate the random episodes

The  primary  sources  of  overhead  associated  with  the  coordination  mechanisms  include 
action  executions  (communication  and  information  gathering),  calls  to  the  local  scheduler, 
and any algorithmic overhead associated with the mechanism itself. Table 2 summarizes the
total amount of overhead from each source for each coordination mechanism setting and the
coordination substrate. L represents the length of processing (time before termination), and
d is a general density measure of coordination relationships. We believe that all of these
amounts can be derived from the environmental parameters in Table 1; they can also be
measured experimentally. Interactions between the presence of coordination mechanisms and
these quantities include: the number of methods or tasks in E, which depends on the
non-local view mechanism; the number of coordination relationships |CR| or the subsets RCR
(redundant  coordination  relationships),  HPCR  (hard  predecessor  coordination  relationships), 
SPCR  (soft  predecessor  coordination  relationships),  which  depends  on  the  number  of  tasks 
and  methods  as  well;  and  the  number  of  commitments  |C|,  which  depends  on  each  of  the 
three  mechanisms  that  makes  commitments. 


Mechanism setting   Communications   Information Gathering   Scheduler   Other Overhead
substrate           0                Ed"                     L           O(LC)
nlv none            0                0                       0           0
nlv some            O(dP)            E detect-CRs            0           O(T ∈ E)
nlv all             O(P)             E detect-CRs            0           O(T ∈ E)
comm min            O(C)             0                       0           O(C)
comm TG             O(C + E)         0                       0           O(C + E)
comm all            O(M ∈ E)         0                       0           O(M ∈ E)
redundant on        O(RCR)           0                       0           O(RCR × S + CR)
hard on             O(HPCR)          0                       O(HPCR)     O(HPCR × S + CR)
soft on             O(SPCR)          0                       O(SPCR)     O(SPCR × S + CR)

Table 2: Overhead associated with individual mechanisms at each parameter setting

4.2 General Performance Issues


We  examined  the  general  performance  of  the  most  complex  (all  mechanisms  in  place)  and 
least  complex  (all  mechanisms  off)  members  of  the  GPGP  family  in  comparison  to  each  other, 
and  in  comparison  to  a  centralized  scheduler  reference  implementation  (as  an  upper  bound). 
We  looked  at  performance  measures  such  as  the  total  final  quality  achieved  by  the  system, 
the  amount  of  work  done,  the  number  of  deadlines  missed,  and  the  termination  time.  The 
centralized  schedule  reference  system  is  not  an  appropriate  solution  to  the  general  coordination 
problem,  even  for  cooperative  groups  of  agents,  for  several  reasons: 

•  The  centralized  scheduling  agent  becomes  a  possible  single  point  of  failure  that  can  cause 
the  entire  system  to  fail  (unlike  the  decentralized  GPGP  system). 

•  The  centralized  scheduling  agent  requires  a  complete,  global  view  of  the  episode — a  view 
that  we  mentioned  earlier  is  not  always  easy  to  achieve.  We  do  not  account  for  any  costs 
in  building  such  a  global  view  in  the  reference  implementation  (viewing  it  as  an  upper 
bound  on  performance).  We  do  not  allow  dynamic  changes  in  the  episodic  task  structure 
(which  might  require  rescheduling). 

•  The centralized reference scheduler uses an optimal single-agent schedule as a starting
point. The problem of scheduling actions in even fairly simple task structures is NP-hard,
and the optimal scheduler's performance grows exponentially worse with the number of
methods to be scheduled. Since the centralized reference scheduler has a global view and
schedules all actions at all agents, the size of the centralized problem always grows faster
than the size of the scheduling problems at GPGP agents with only partial views and
heuristic schedulers.

We conducted 300 paired-response experiments using the three algorithms. “Balanced”
refers to all mechanisms being on, with partial non-local views and communication of
committed results and completed task groups. “Simple” refers to all mechanisms being off, with no
non-local view and broadcast communication of all results. “Parallel” refers to the centralized
reference scheduler that uses a heuristic parallelization of an optimal single-agent schedule using
a complete global view. The experiments were based on the same environmental parameters
as  the  facilitation  experiments  (Table  1).  There  are  several  important  things  to  note  about  this 
class  of  environments: 

•  The  size  of  the  episodes  was  kept  artificially  small  so  that  the  centralized  reference 
scheduler  could  find  an  optimal  schedule  in  a  reasonable  amount  of  run  time. 

•  The experiments had a very low (10%) number of enables relationships and a low (20%)
number of MIN quality accrual functions, because both penalize the simple algorithm;
we demonstrate this in Section 4.4.

•  Deadline  pressure  was  also  kept  low  (it  also  makes  the  simple  algorithm  perform  badly). 

In our experiments, the centralized parallel scheduler outperformed our distributed GPGP
agents 57% of the time (36% no difference, 7% distributed was better) using the total final
quality as the only criterion. The GPGP agents produced 85% of the quality that the centralized
parallel scheduler did, on average. These results need to be understood in the proper context:
the  centralized  scheduler  takes  much  more  processing  time  than  the  distributed  scheduler  and 
cannot  be  scaled  up  to  larger  numbers  of  methods  or  task  groups.  The  centralized  scheduler  also 
starts  with  a  global  view  of  the  entire  episode.  Table  3  shows  the  results  for  all  four  measured 
criteria  by  summarizing  within-block  (paired-response)  comparisons.  For  total  final  quality 
and  number  of  deadlines  missed,  “better”  simply  refers  to  an  episode  where  the  algorithm  in 
question had a greater total final quality or missed fewer deadlines, respectively. With respect
to method execution time (a measure of system load) and termination time, “better” means
that one algorithm both produced higher quality and missed fewer deadlines than the other
or, if the two algorithms were the same on those measures, that it had a lower total method
execution time (lower load) or terminated sooner.®
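
The within-block tallies described above can be sketched as follows. This is only an illustration of a single-measure comparison with a "same" tolerance (such as the two-time-unit window used for termination time); the function name, the data, and the percentage layout are assumptions, and the composite rule for the time-based measures is not implemented here.

```python
from collections import Counter


def tally_pairs(a_scores, b_scores, higher_is_better=True, tolerance=0.0):
    """Classify each paired episode as won by 'a', won by 'b', or 'same'.

    Returns the share of episodes in each class as a percentage.
    """
    if len(a_scores) != len(b_scores) or not a_scores:
        raise ValueError("paired data must be non-empty and of equal length")
    counts = Counter()
    for a, b in zip(a_scores, b_scores):
        # Orient the difference so that a positive value always favours 'a'.
        diff = a - b if higher_is_better else b - a
        if abs(diff) <= tolerance:
            counts["same"] += 1
        elif diff > 0:
            counts["a"] += 1
        else:
            counts["b"] += 1
    n = len(a_scores)
    return {key: 100.0 * counts[key] / n for key in ("a", "b", "same")}


# Hypothetical termination times for five paired episodes (lower is better).
parallel = [40, 55, 61, 48, 70]
balanced = [45, 54, 80, 48, 66]
result = tally_pairs(parallel, balanced, higher_is_better=False, tolerance=2)
print(result)
```

With these made-up numbers, two episodes go to the first algorithm, one to the second, and two fall inside the tolerance window.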

We  also  looked  at  performance  without  any  of  the  mechanisms;  on  the  same  300  episodes 
the  GPGP  agents  produced  on  average  1.14  times  the  final  quality  of  the  uncoordinated  agents. 
Coordinated agents (“balanced”) execute far fewer methods because of their ability to avoid
redundancy. The redundant execution of methods is a much greater hindrance to
the uncoordinated agents when acting under severe time pressure [4]. Table 4 summarizes the
results.


                        Parallel better   Balanced better   Same   Significant?
Total Final Quality     57%               7%                36%    yes
Method Execution Time   80%               7%                13%    yes
Deadlines Missed        1%                1%                98%    no
Termination Time        67%               15%               18%    yes

Table 3: Performance comparison: Centralized Parallel Scheduler vs. Balanced GPGP Coordination and
Decentralized DTT Scheduler


                        Simple better   Balanced better   Same   Significant?
Total Final Quality     8%              21%               71%    yes
Method Execution Time   12%             72%               16%    yes
Deadlines Missed        0%              4%                96%    yes
Termination Time        9%              58%               33%    yes

Table  4:  Performance  comparison:  Simple  GPGP  Coordination  vs.  Balanced  GPGP  Coordination 


®  Termination  within  two  time  units  was  considered  “the  same”  because  the  “balanced”  algorithm  has  a  fixed 
2-unit  startup  cost.  The  average  task  duration  is  10  time  units. 
4.3 Taking Advantage of a Coordination Relationship: When to Add a New Mechanism

A practical question to ask is simply whether the addition of a particular mechanism will benefit
the performance of the system of agents. Here we give an example with respect to the soft
coordination mechanism (Mechanism 5), which makes commitments to facilitation relationships.
We ran 234 randomly generated episodes (generated with the environmental parameters shown
in Table 1) with four agents, both with and without the soft coordination mechanism. Because
the variance between these randomly generated episodes is so great, we took advantage of the
paired-response nature of the data to run a non-parametric Wilcoxon matched-pairs
signed-ranks test [3]. This test is easy to compute and makes very few assumptions, primarily that the
variables are interval-valued and comparable within each block of paired responses. For each
of the 234 blocks we calculated the difference in the total final quality achieved by each group
of agents and excluded the blocks where there was no difference, leaving 102 blocks. We then
replaced the differences with the ranks of their absolute values, restored the signs on the
ranks, and summed the positive and negative ranks separately. A standardized Z score is then
calculated from these sums. A Z near zero means that there was not much consistent variation,
while a Z of large magnitude is unlikely to occur unless one treatment consistently outperformed
the other. In our experiment, the null hypothesis is that the system with the soft coordination
mechanism did the same as the one without it, and our alternative is that the system with the
soft coordination mechanism did better (in terms of total final quality). The result here was
Z = -6.9, which is highly significant, and allows us to reject the null hypothesis that the
mechanism did not have an effect.
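
The ranking procedure described above can be sketched in a few lines. This is a minimal illustration, not the computation used for the reported result: the function name and the sample data are hypothetical, tied absolute differences receive average ranks, Z is computed from the smaller of the two rank sums using the usual normal approximation (so it is never positive), and no tie correction is applied to the variance. All of these choices are assumptions.

```python
import math


def wilcoxon_signed_rank_z(treated, control):
    """Return (n, w_plus, w_minus, z) for paired samples."""
    # Drop blocks with no difference, as in the procedure described above.
    diffs = [a - b for a, b in zip(treated, control) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Restore the signs and sum positive and negative ranks separately.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Normal approximation for the smaller rank sum.
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (min(w_plus, w_minus) - mean) / sd
    return n, w_plus, w_minus, z


# Hypothetical total final quality for six paired episodes.
with_mechanism = [10, 12, 9, 15, 11, 14]
without_mechanism = [8, 12, 10, 11, 8, 10]
n, w_plus, w_minus, z = wilcoxon_signed_rank_z(with_mechanism, without_mechanism)
print(n, w_plus, w_minus, round(z, 3))
```

For a sample this small the normal approximation is crude; it is shown only to make the steps concrete.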

4.4  Different  Family  Members  for  Different  Environments 

In this section we show a particular example of how different family members do better and
worse in different environments. We will concentrate on two distinct family members: the
‘modular agent’ archetype (all CR modules on, non-local views, communicate commitments
and completed task groups), and the ‘simple agent’ (no CR modules on, no non-local views,
broadcast all completed methods). The environmental parameter we will vary (derived from
the screening data collected in Section 5) is QAF-min, the percentage of tasks that have min as
their quality accumulation function (‘AND’ semantics). Our hypothesis was that the modular
agents would do better than the simple agents as QAF-min increased (as more tasks needed to
be done). We ran 250 paired-response experiments at 5 levels of QAF-min (0, 0.25, 0.5, 0.75,
1.0) with enables-probability also varying at the same 5 levels, no time pressure, overlaps of 0.5,
5 task groups, and 4 agents per run. The performance (in terms of total final quality) of the two
coordination styles was significantly different by the Wilcoxon matched-pairs signed-ranks test
(199 different pairs, Z = -3.27, p < 0.0005). More interestingly, we can see the difference
in performance widening with the value of QAF-min. Figure 5 shows the probability of one
coordination style or the other doing better (calculated simply from the frequencies) plotted
versus the value of QAF-min, which shows graphically how the difference between the styles
changes with QAF-min.
Figure 5: Plot of the probability of the modular or simple coordination styles doing better than the other (total
final quality) versus the probability of task quality accumulation being MIN (AND-semantics)
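
The frequency-based probabilities plotted in Figure 5 can be computed along the following lines. This is a sketch under assumptions: the record layout, the function name, and the sample values are all hypothetical.

```python
from collections import defaultdict


def win_frequencies(records):
    """records: iterable of (qaf_min, modular_quality, simple_quality).

    Returns, for each QAF-min level, the fraction of paired runs in which
    each style achieved the higher total final quality.
    """
    counts = defaultdict(lambda: {"modular": 0, "simple": 0, "same": 0})
    for level, modular_q, simple_q in records:
        if modular_q > simple_q:
            counts[level]["modular"] += 1
        elif simple_q > modular_q:
            counts[level]["simple"] += 1
        else:
            counts[level]["same"] += 1
    frequencies = {}
    for level, c in counts.items():
        total = sum(c.values())
        frequencies[level] = {style: c[style] / total for style in c}
    return frequencies


# Hypothetical paired results at three QAF-min levels.
runs = [
    (0.0, 10, 10), (0.0, 12, 11),
    (0.5, 9, 7), (0.5, 8, 8),
    (1.0, 9, 4), (1.0, 7, 3),
]
freq = win_frequencies(runs)
for level in sorted(freq):
    print(level, freq[level])
```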

4.5 Meta-level Communication: Return to Load Balancing through Dynamic Reorganization

Another  question  we  have  examined  is  the  effect  of  task  structure  variance  on  the  performance 
of  load  balancing  algorithms.  This  work  is  a  logical  follow-on  to  the  analysis  of  static,  dynamic, 
and  negotiated  reorganization  detailed  in  [6].  A  static  organization  divides  the  load  up  a 
priori — in  the  case  below,  by  randomly  assigning  redundant  tasks  to  agents.  A  one-shot  dynamic 
reorganization,  like  that  analyzed  in  [7],  assigns  redundant  tasks  on  the  basis  of  the  expected 
load  on  other  agents.  A  meta-level  communication  (MLC)  reorganization  assigns  redundant 
tasks  on  the  basis  of  actual  information  about  the  particular  problem-solving  episode  at  hand. 
Because  it  requires  extra  communication,  the  MLC  reorganization  is  more  expensive,  but  the 
extra  information  pays  off  as  the  variance  in  static  agent  loads  grows. 

An MLC coordination mechanism (mechanism 6) can be implemented in GPGP. Many
such implementations are possible; the one that we chose works by altering the way redundant
commitments are handled. When a commitment is sent to another agent, it is modified to
include the current load of the agent making the commitment (to be precise, the amount of work
for the agent in the current schedule). Whenever a decision about redundant commitments
needs to be made at another agent (in mechanisms 3, 4, and 5: simple redundancy, hard, and
soft successor relationship handling), the loads of the agents with the redundant commitments
are taken into account at the point where ties would have been broken randomly. The agent
with the lowest load keeps the commitment instead. If the loads are equal, the tie is broken
randomly as before.
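
The tie-breaking rule just described can be illustrated with a short sketch. The function name, the dictionary representation of loads, and the agent labels are assumptions made for illustration; the GPGP implementation itself is not shown in this paper.

```python
import random


def keep_commitment(loads, rng=random):
    """Pick the agent that keeps a redundant commitment.

    loads maps each agent holding the redundant commitment to the amount of
    work in its current schedule. The least-loaded agent keeps the
    commitment; equal loads fall back to a random choice.
    """
    if not loads:
        raise ValueError("at least one agent is required")
    lowest = min(loads.values())
    candidates = sorted(agent for agent, load in loads.items() if load == lowest)
    if len(candidates) == 1:
        return candidates[0]
    return rng.choice(candidates)


# Hypothetical loads attached to three redundant commitments.
winner = keep_commitment({"A": 30, "B": 12, "C": 25})
tied_winner = keep_commitment({"A": 10, "B": 10, "C": 20}, random.Random(0))
print(winner, tied_winner)
```

Passing a seeded `random.Random` instance makes the tie-break reproducible in experiments.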

The  effect  of  this  mechanism  on  the  general  GPGP  environments  when  agents  use  the 
default Design-To-Time scheduler is minimal. The heuristics used by the DTT scheduler are
focused on providing the highest possible total final quality for the agent without violating
deadlines — this  is  not  the  same  as  terminating  quickly,  and  the  scheduler  has  no  heuristics  to 
prefer  earlier  termination  times  (nor,  frankly,  should  it  have  them).  In  a  randomly-generated 
task  environment,  where  the  methods  are  assigned  to  agents  randomly  (and  therefore,  somewhat 
evenly)  there  is  rarely  any  significant  change  in  termination  time. 
However, recall from [6, 7] that MLC
coordination is most useful in environments with high variance in the task structures presented
to agents. We can look at our experiments in this light by calculating an endogenous input
variable for each run that represents the amount of variance in redundant tasks (the ones that
would potentially be eligible for a load-balancing mechanism decision). Figure 6 shows how
the probability of terminating more quickly with the MLC load balancing algorithm grows as
the standard deviation in the total durations of redundant tasks at each agent grows.


Figure 6: Probability that MLC load balancing will terminate more quickly than static load balancing, fitted using
a loglinear model from actual TÆMS simulation data.


5  Exploring  the  Family  Performance  Space 

Finally,  we  looked  at  the  multidimensional  performance  space  for  the  family  of  coordination 
algorithms over four different performance measures. At the most abstract level, each of the
five mechanisms is parameterized independently (the first two have three possible settings and
the last three can be ‘in’ or ‘out’) for a total of 72 possible coordination algorithms. We applied
two standard statistical clustering techniques to develop a much smaller set of significantly
different algorithms. The resulting five ‘prototypical’ combined behaviors are a useful starting
point when searching for an appropriate algorithm family member in a new environment.
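
The count of 72 follows from 3 x 3 x 2 x 2 x 2 independent settings. A small sketch that enumerates the family is given here; the setting labels are taken from Table 2, while the dictionary keys and function name are assumptions.

```python
from itertools import product

# Non-local view and communication each have three settings;
# the three commitment mechanisms are either in (True) or out (False).
NLV_SETTINGS = ("none", "some", "all")
COMM_SETTINGS = ("min", "TG", "all")
IN_OUT = (True, False)


def enumerate_family():
    """Return every combination of mechanism settings as a dictionary."""
    return [
        {"nlv": nlv, "comm": comm, "redundant": red, "hard": hard, "soft": soft}
        for nlv, comm, red, hard, soft in product(
            NLV_SETTINGS, COMM_SETTINGS, IN_OUT, IN_OUT, IN_OUT
        )
    ]


family = enumerate_family()
print(len(family))  # 72
```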

The analysis proceeded as follows: we generated one random episode in each of 63 randomly
chosen environments, and ran each of the 72 “agent types” on the episode (4536 cases).
We collected four performance measures: total quality, number of methods executed, number
of communication actions, and termination time. We then standardized each performance
measure within an environment, so that each measure is represented as the number of standard
deviations from the mean value in that environment. We then took summary statistics for each
measure grouped by agent type; this boils the 4536 cases (standardized within each environment)
down into 72 summary cases (summarized across environments). Each of the 72 summaries
corresponds to the average standardized performance of one agent type for the four performance
measures. We then used a hierarchical clustering algorithm (SYSTAT JOIN with complete
linkage*) to produce the following general prototypical agent classes (we chose one
representative algorithm in each class):

Simple: No commitments or non-local view, just broadcasts results.

Myopic: All commitment mechanisms on, but no non-local view.

Balanced: All mechanisms on.

Tough-guy: Agent that makes no soft commitments.

Mute: No communication whatsoever.®

Figure 7: Standardized Performance by the 5 named coordination styles.
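
The within-environment standardization and per-agent-type summary used in this analysis can be sketched as follows. The data layout and function names are hypothetical, and the use of the population standard deviation is an assumption (the text does not say which estimator was used).

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_within_environment(cases):
    """cases: list of (environment, agent_type, value) for one measure.

    Each value is replaced by its distance from the environment mean,
    measured in standard deviations of that environment.
    """
    by_env = defaultdict(list)
    for env, _, value in cases:
        by_env[env].append(value)
    stats = {env: (mean(vals), pstdev(vals)) for env, vals in by_env.items()}

    standardized = []
    for env, agent_type, value in cases:
        mu, sigma = stats[env]
        z = 0.0 if sigma == 0 else (value - mu) / sigma
        standardized.append((env, agent_type, z))
    return standardized


def summarize_by_agent_type(standardized):
    """Average the standardized scores of each agent type across environments."""
    by_agent = defaultdict(list)
    for _, agent_type, z in standardized:
        by_agent[agent_type].append(z)
    return {agent_type: mean(zs) for agent_type, zs in by_agent.items()}


# Hypothetical total-quality values for two agent types in two environments.
cases = [
    ("env1", "simple", 10), ("env1", "balanced", 20),
    ("env2", "simple", 100), ("env2", "balanced", 300),
]
summary = summarize_by_agent_type(standardize_within_environment(cases))
print(summary)
```

Standardizing first keeps environments with very different raw scales from dominating the per-type averages, which is what makes the 72 summary cases comparable before clustering.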

Figure 7 shows the values of several typical performance measures for only the five named
types. Performance measures were standardized within each episode (i.e., across all 72 types).
Shown for each are the means and the 10, 25, 50, 75, and 90 percent quantiles. All algorithms’
performances are significantly different by Tukey-Kramer HSD except for: Method Execution
(Simple vs. Mute), Total Final Quality (Balanced vs. Tough), and Deadlines Missed (Simple vs.
Mute and Balanced vs. Tough).

* Distances are calculated between the farthest points in each cluster. Other distance measures (Euclidean,
centroid, or Pearson correlation) gave similar results.

® This algorithm makes no commitments (mechanisms 3, 4, and 5 off) and communicates (mechanism 2)
only satisfied commitments; therefore it never sends any communications.

We  are  also  analyzing  the  effect  of  environmental  characteristics  on  agent  performance. 
Figure  8  shows  an  example  of  the  effect  of  the  amount  of  overlap  (method  redundancy)  on 
the  number  of  method  execution  actions  for  the  five  named  agent  types.  Note  again  that  the 
balanced  and  tough  agents  do  significantly  less  work  when  there  is  a  lot  of  overlap  (as  would 
be expected). The performance of the tough and balanced agents is similar because (1) the
algorithms differ only in the way that they handle facilitation, and (2) according to Table 1,
only half the experiments had any facilitation, and when present it was only at 50% power.


Figure  8:  The  effect  of  overlaps  in  the  task  environment  on  the  standardized  method  execution  performance  by 
the  5  named  coordination  styles  (smoothed  splines  fit  to  the  means). 

A linear clustering algorithm, SYSTAT KMEANS, produces a result similar to that of
hierarchical clustering, and also produces the mean value of each performance measure for each
group. For example, the non-communicating agents have a high negative mean
"number-of-communications" (-1.16; remember these were averaged from standardized scores) but execute
more methods on average and produce less final quality. They also terminate slightly quicker
than average. The "balanced" group, in comparison, communicates a little more than average,
executes many fewer methods (-1.29, way out on the edge of this statistic), returns
better-than-average quality and about average termination time. This is reasonable, as avoiding redundant
work and other work-reducing ideas are a key feature of the original PGP algorithm replicated
by this set of mechanisms.

6  Conclusions  and  Future  Work 

This paper discusses the design of an extendable family of scheduling coordination mechanisms,
called Generalized Partial Global Planning (GPGP), that form a basic set of coordination
mechanisms for teams of cooperative computational agents. An important feature of this
approach is an extendable set of modular coordination mechanisms, any subset or all
of which can be used in response to a particular task environment. This subset may be
parameterized, and the parameterization does not have to be chosen statically, but can instead
be chosen on a task-group-by-task-group basis or even in response to a particular
problem-solving situation. For example, Mechanism 5 (Handle Soft Predecessor CRs) might be ‘on’
for certain classes of tasks and ‘off’ for other classes (that usually have few or very weak soft
CRs). The general specification of the GPGP mechanisms involves the detection of, and response
to, certain abstract coordination relationships in the incoming task structure that are not tied
to a particular domain. We have used TÆMS to model a simple distributed sensor network
problem, the original DVMT domain, and a hospital scheduling environment. A careful
separation  of  the  coordination  mechanisms  from  an  agent’s  local  scheduler  allows  each  to  better 
do  the  job  for  which  it  was  designed.  We  believe  this  separation  is  not  only  useful  for  applying 
our  coordination  mechanisms  to  problems  with  existing,  customized  local  schedulers,  but  also 
to  problems  involving  humans  (where  the  coordination  mechanism  can  act  as  an  interface 
to  the  person,  suggesting  possible  commitments  for  the  person’s  consideration  and  reporting 
non-local  commitments  made  by  others). 

The GPGP coordination approach as described in this paper has been fully implemented
in the TÆMS simulation testbed. Significant experimental validation of the GPGP approach is
documented in [4]. This paper showed how to decide if the addition of a new GPGP mechanism
was  useful.  It  showed  the  general  performance  of  two  GPGP  family  algorithms  compared  to  a 
centralized  parallel  reference  algorithm;  GPGP  with  all  mechanisms  on  produces  85%  of  the 
quality  of  the  centralized  reference  scheduler  in  a  random  environment.  Such  performance 
is  reasonable  and  we  feel  could  be  made  even  better  by  developing  better  local  scheduling 
algorithms  and  new  coordination  mechanisms. 

We also demonstrated how a feature of the task environment (the probability of task quality
accumulation being MIN) can cause different GPGP family members to be preferred. We
also discussed a sixth mechanism, a load balancing mechanism that communicates meta-level
information, and showed that it was somewhat more useful when the variance in duration of
the agents’ overlapping tasks was high. This result thus ties back to the discussion in [6, 7]
on the usefulness of meta-level communication (in this case, the transmission of local load
information) when the inter-episode variance (in this case, in the initial agent loads) is high.

Finally, we gave a sense of the performance space of the five broadly-parameterized
mechanisms using a clustering technique. Clustering can be a useful method for dealing with large
algorithm spaces to prune the search for an appropriate combination of mechanisms. Such
methods may also lead to ways to learn situation-specific knowledge about the application of certain
mechanisms in certain situations (perhaps using case-based reasoning techniques).

We  believe  that  GPGP  can  become  a  reusable,  domain-independent  basis  for  multi-agent 
coordination  when  used  in  conjunction  with  a  library  of  coordination  mechanisms  and  a 
learning  mechanism.  We  intend  to  develop  such  a  library  of  reusable  coordination  mechanisms. 
Examples include mechanisms that work from the successors of hard and soft relationships instead
of the predecessors, negotiation mechanisms, mechanisms for behavior such as contracting, and
mechanisms that can be used by self-motivated agents in non-cooperative environments. Many
of these mechanisms can be built on the existing work of other DAI researchers. Future work will
also examine expanding the parameterization of the mechanisms and using machine learning
techniques to choose the appropriate parameter values (i.e., learning the best mechanism set
for an environment). Finally, we are also beginning work on using the GPGP approach in
applications ranging from providing human coordination assistance to distributed information
gathering.
Abstract

In our research, we explore the role of negotiation for conflict resolution in distributed search
among heterogeneous and reusable agents. We present negotiated search, an algorithm that
explicitly recognizes and exploits conflict to direct search activity across a set of agents. In
negotiated search, loosely coupled agents interleave the tasks of 1) local search for a solution to
some subproblem; 2) integration of local subproblem solutions into a shared solution; 3)
information exchange to define and refine the shared search space of the agents; and 4)
assessment and reassessment of emerging solutions. Negotiated search is applicable to diverse
application areas and problem-solving environments. It requires only basic search operators
and allows maximum flexibility in the distribution of those operators. These qualities make
the algorithm particularly appropriate for the integration of heterogeneous agents into
application systems. The algorithm is implemented in a multi-agent framework, TEAM, that
provides the infrastructure required for communication and cooperation.

1  Introduction 

The current state of knowledge-based technology is such that almost every application system
is built from scratch. In order to move beyond the prohibitive cost of constantly reinventing,
rerepresenting, and reimplementing the wheel, researchers are beginning to examine the
feasibility of building application systems with reusable agents [Neches et al., 1991]. A reusable
agent is designed to work without a priori knowledge of the agent set in which it will be
embedded, instead using a flexible, reactive approach to cooperation. Although this flexibility
can lead to inefficient problem solving, an agent can often gather information about the agent
set as problem solving progresses to improve efficiency.

This  research  was  supported  by  ARPA  under  ONR  Contract 
#N00014-92-J-1698.  The  content  of  the  information  does  not 
necessarily reflect the position or the policy of the Government,
and no official endorsement should be inferred.


Multi-agent systems do not traditionally acknowledge the role of conflict among agents as a
driving force in the control of problem-solving activity. In reusable-agent systems, however,
conflict is inevitable since agents are implemented at different times by different people and
in different environments. We present a distributed-search algorithm, negotiated search, that
uses conflict as a source of control information for directing search activity across a set of
heterogeneous agents in their quest for a mutually acceptable solution.

The negotiated-search algorithm has been successfully incorporated into two implemented
systems. In [Lander and Lesser, 1992b], we describe distributed search in the context of a
seven-agent steam condenser design system and discuss how different operator/agent
assignments within the negotiated-search algorithm affect problem solving. In [Lander and
Lesser, 1992a], a two-agent contract negotiation system is presented, and negotiated search is
compared to a search strategy that is tailored to characteristics of that environment. Through
analysis of the environment and search algorithms, we show the versatility and effectiveness of
negotiated search in reusable-agent systems while also pointing out that customized search
strategies are inflexible but can improve system performance when they can be applied. In
this paper, we describe negotiated search from an application-independent perspective.

The need for a flexible algorithm to support reusability and heterogeneity motivates particular
aspects of negotiated search:

•  Conflict,  negotiation,  and  democratic  determination 
of  acceptability  are  integral  parts  of  the  algorithm. 

•  Agent  coordination  is  accomplished  through  clearly 
defined  individual  roles  in  the  evolution  of  a  shared 
solution.  These  roles  are  realized  as  operators  that 
accomplish  state  transitions  on  shared  solutions. 

•  Operators represent standard and widely available
search and information-assimilation capabilities. A
particular agent may instantiate all defined operators
or some subset of defined operators.

•  Whenever  possible,  feedback  is  used  to  refine  the 
perceived  search  spaces  of  individual  agents  to  more 
closely  reflect  the  true  composite  search  space. 

TEAM agents are not hostile and will not intentionally mislead or otherwise try to sabotage
another agent’s reasoning. They are cooperative in the sense that an agent is willing to
contribute both knowledge and solutions to other agents as appropriate and to accept solutions
that are not locally optimal in order to find a mutually acceptable solution. Each agent is a
stand-alone system with specific capabilities that allow it to be included in an integrated
multi-agent system. We assume that agents can be heterogeneous in architecture, inference
engines, evaluation criteria and priorities for solutions, and in long-term knowledge. Each agent
does its own internal scheduling and has private data, knowledge, and history mechanisms.

In negotiated search, agents interleave the tasks of 1) local search for a solution to some
subproblem; 2) integration of local subproblem solutions into a shared solution (the composite
solution);¹ 3) negotiation to define and refine the shared search space of the agents; and 4)
assessment and reassessment of emerging solutions.

In the remainder of the paper, we first motivate the development of our negotiated-search
model by presenting an intuitive description of negotiation and, from this foundation,
constructing an algorithmic model of the negotiation process. The next section details negotiated
search from a state-based perspective similar to that used by von Martial to describe negotiation
protocols in distributed planning [von Martial, 1992]. We then present seven basic
negotiated-search operators. The final section briefly describes the status of the implementation
and extensions to this model that are not covered in this paper.

2  An  Initial  Perspective  on  Negotiation 

In  this  section,  we  begin  with  an  intuitive  description  of 
negotiation: 

One agent generates a proposal and other agents review it. If some other agent doesn’t
like the proposal, it rejects it and provides some feedback about what it doesn’t like. Some
agent may generate a counter-proposal. If so, the other agents (including the agent that
generated the first proposal) then review the counter-proposal and the process repeats. As
information is exchanged, conflicts become apparent among the agents. Agents may respond
to the conflicts by incrementally relaxing individual preferences until some mutually
acceptable ground is reached.

This  example  captures  the  primary  characteristics  that 
one  would  expect  to  see: 

•  proposals  are  generated  by  one  or  more  agents 

•  agents  evaluate  proposals  based  on  their  individual 
criteria  for  solution  acceptability 

•  agents  provide  feedback  about  what  they  like  or 
don’t  like  about  particular  proposals,  resulting  in 
a  progressively  better  understanding  of  the  shared 
requirements  for  solutions  over  time 

^Sathi  similarly  uses  the  term  composition  as  the  name 
of  a  specific  search  operator  that  combines  local  informa¬ 
tion  [Sathi  and  Fox,  1989] 


•  agents  can  play  different  roles  in  the  negotiation 
process,  e.g.,  an  agent  can  be  a  reviewer  for  an¬ 
other  agent’s  proposal  and  then  be  a  generator  for 
a  counter-proposal 

•  conflicts  exist  among  the  agents’  requirements  for 
acceptable  solutions 

•  agents  incrementally  relax  their  solution  require¬ 
ments  to  reach  agreement 

•  the  decision  to  accept  or  not  accept  a  proposal  is  a 
joint,  democratic  process 

Some  extensions  to  the  definition  are  required.  For 
example,  it  assumes  that  a  proposal  becomes  a  solution 
when  it  is  accepted  by  all  agents.  However,  this  assump¬ 
tion  rules  out  situations  in  which  high-level  problems 
are  decomposed  and  each  agent  works  on  some  subprob¬ 
lem.  In  this  case,  the  proposal  an  agent  makes  does  not 
represent  a  complete  solution  but  rather  some  compo¬ 
nent  of  a  solution  that  interacts  with  other  components 
through  shared  attributes.  Evaluation  is  then  indirect 
since  an  agent  cannot  evaluate  proposals  for  interact¬ 
ing  components  that  are  outside  of  its  domain  of  exper¬ 
tise.  In  negotiated  search,  an  agent  evaluates  an  external 
interacting-component proposal by creating and evaluating
a compatible local proposal (i.e., one that has the
same  values  for  shared  attributes),  thereby  focusing  on 
how  the  external  proposal  affects  local  quality. 
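
The indirect evaluation described above can be sketched as follows. This is a minimal illustration, not TEAM's implementation: the function name, the dictionary representation of proposals, and the pump/motor data are all assumptions made for the example.

```python
from typing import Callable, Iterable, Optional

Proposal = dict[str, float]


def evaluate_external(
    external: Proposal,
    shared: set[str],
    local_candidates: Iterable[Proposal],
    local_quality: Callable[[Proposal], float],
) -> Optional[tuple[Proposal, float]]:
    """Rate an external proposal via the best compatible local proposal."""
    best: Optional[tuple[Proposal, float]] = None
    for candidate in local_candidates:
        # Compatible: identical values for every shared attribute.
        if all(candidate.get(a) == external.get(a) for a in shared):
            score = local_quality(candidate)
            if best is None or score > best[1]:
                best = (candidate, score)
    return best  # None means no compatible local proposal exists


# Made-up data: a pump agent reviews a motor proposal via shaft speed.
motor = {"shaft_speed": 1800.0, "power": 40.0}
pumps = [
    {"shaft_speed": 1800.0, "head": 30.0},
    {"shaft_speed": 1800.0, "head": 45.0},
    {"shaft_speed": 3600.0, "head": 60.0},
]
best = evaluate_external(motor, {"shaft_speed"}, pumps, lambda p: p["head"])
print(best)
```

The reviewing agent never inspects the motor's private attributes; it only asks how good its own component can be once the shared attribute is fixed.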

Although  a  proposal  includes  the  information  required 
to  implement  a  solution,  it  provides  only  a  surface-level 
view  of  the  reasoning  that  went  into  creating  it.  It  is 
sometimes  possible  to  make  guesses  about  other  agents’ 
requirements  that  could  be  used  in  generating  counter¬ 
proposals.  However,  in  the  general  case  of  reusable 
agents,  external  local  evaluation  criteria  for  solutions 
cannot  be  predicted,  nor  can  they  be  inferred  from  the 
“snapshot”  provided  by  a  proposal.  For  proposals  and 
counter-proposals  to  be  related,  there  must  be  a  deeper 
understanding  of  the  shared  search  space  of  the  agents. 
This  understanding  is  achieved  through  a  feedback  sys¬ 
tem  that  can  be  separate  from  the  proposals. 

3  Negotiated  Search 

Artificial intelligence researchers have previously used
the term negotiation with respect to conflict resolution
and avoidance [Adler et al., 1989, Klein, 1991,
Lander and Lesser, 1992a, Sycara, 1985, Werkman,
1992], task allocation [Cammarata et al., 1983, Durfee
and Montgomery, 1990, Davis and Smith, 1983], and
resource allocation [Adler et al., 1989, Conry et al., 1992,
Sathi and Fox, 1989, Sycara et al., 1991]. Negotiation
is sometimes treated as an independent process that is
used to select one of a set of existing alternative solutions
[Zlotkin and Rosenschein, 1990] rather than as an
inherent part of a solution-generation process. It can be
difficult under conditions where agents are hostile and
unwilling to share private information [Sycara, 1985].
Negotiation can occur among peers [Cammarata et al.,
1983, Lander and Lesser, 1992b], through a mediator
or arbitrator [Sycara, 1985, Werkman, 1992], or
hierarchically through an organization [Durfee and
Montgomery, 1990, Davis and Smith, 1983]. It can occur at
either the domain or control level of problem solving.
Laasri et al. describe the recursive negotiation model, a
general model of multi-agent problem solving that details
various situations that can potentially benefit from
negotiation [Laasri et al., 1992]. In examining this model, it
becomes clear that negotiation is a pervasive process that
remains relatively untapped by current computational
systems. In developing the negotiated-search model, we
have tried to capture the key requirements for negotiation
without restricting the domain, task decomposition,
or organizational model of the agent set.

Several researchers have developed algorithms and
heuristics for constraint-directed distributed search in
situations involving multiple homogeneous agents [Sathi
and Fox, 1989, Sycara et al., 1991, Yokoo et al.,
1992].^ We extend this work to handle situations where
heterogeneous  agents  may  have  different  or  multiple  local 
problem-solving  paradigms,  instantiate  different  search 
operators,  and  where  agents  may  not  be  able  to  pro¬ 
vide  specific  information  to  other  agents  or  understand 
information  received  from  other  agents.  The  negotiated- 
search  algorithm  is  particularly  suitable  to  this  style  of 
problem  solving  because  1)  the  required  search  opera¬ 
tors  represent  standard  search  capabilities;  2)  the  search 
operators  can  be  flexibly  assigned  across  the  agent  set 
according  to  the  search  capabilities  of  each  agent;  and 
3)  agents  use  incremental  relaxation  of  solution  require¬ 
ments  to  reach  mutual  acceptability  as  an  inherent  part 
of  problem  solving. 

3.1  The  Search  Process 

Search  is  initiated  by  a  problem  specification  that  de¬ 
tails  the  form  of  a  solution  and  values,  preferences,  or 
constraints  on  some  attributes  of  that  solution.  This 
specification  is  placed  in  a  centralized  shared  memory  as 
are emerging composite solutions.^ Some agent(s) uses
constraining  information  from  the  specification  and  its 
local  solution  requirements  to  propose  an  initial  partial 
solution  called  a  base  proposal.  The  base  proposal  is  then 
extended  and  evaluated  by  other  agents  during  future 
processing  cycles.  When  a  particular  solution  cannot  be 
extended  by  some  agent  due  to  conflicts  with  existing  so¬ 
lution  attributes,  there  are  two  possible  outcomes:  1)  if 
the conflict is caused by the violation of some hard
(non-relaxable) requirement, the solution path is pruned (e.g.,
arc 5 in Figure 1); or 2) if the conflict is caused by the
violation of some soft (relaxable) solution requirement, the
solution is saved and viewed as a potential compromise
(e.g., arc 9 in Figure 1). In the first case, no more work
will  be  done  on  that  solution,  and,  to  the  extent  that  the 
violated  requirement  can  be  communicated  to  and  assim¬ 
ilated  by  other  agents,  future  counter-proposals  will  not 
violate  that  same  requirement.  In  the  second  case,  the 
violated  requirement  may  eventually  be  relaxed  and,  if 
that  happens,  the  potential  compromise  will  become  a 

^Agents  may  control  different  resources  and  have  different 
constraints  on  solutions,  but  they  share  a  single  underlying 
problem-solving  paradigm  and  knowledge  representation. 

^Each  agent  also  has  a  local  short-term  memory  where  it 
stores  intermediate  results  and/or  component  proposals  that 
are  linked  to  composite  solutions  in  shared  memory. 


viable  solution  again.  Future  counter-proposals  will  take 
the  violated  requirement  into  account  but  are  not  guar¬ 
anteed  to  avoid  the  same  conflict,  since  other  alternatives 
may  be  worse. 

In both of the above cases, conflict is used as the trigger
for the communication of feedback information. In
multi-agent  systems,  it  is  always  problematic  to  decide 
what  information  should  be  exchanged  and  when  that 
exchange  should  take  place.  In  general,  agents  want  to 
minimize  the  amount  of  information  they  share  since  it 
is  expensive  both  to  communicate  information  and  to 
assimilate  information.  On  the  other  hand,  sharing  in¬ 
formation  that  will  specifically  help  another  agent  avoid 
future  conflicts  is  generally  cost  effective  since  it  elim¬ 
inates  the  expense  of  generating  unproductive  solution 
paths  [Lander,  1993].  In  negotiated  search,  an  agent 
that  receives  conflict  information  from  another  agent  can 
choose  whether  or  not  to  prune  its  own  search  to  respect 
that  information  (see  Section  4.5). 

Multiple  solution  paths  can  be  concurrently  investi¬ 
gated  in  negotiated  search.  Agents  are  free  to  initiate 
solutions  at  any  time  either  because  there  aren’t  any 
promising  solutions  in  the  current  solution  set  or  because 
they  have  no  other  work  to  do.  Advantages  to  main¬ 
taining  multiple  paths  include  exploiting  the  potential 
for  concurrent  activity  and  having  the  ability  to  directly 
compare different potential compromises. There are
disadvantages to concurrently exploring multiple solution
paths, however: multiple partial solutions have to be
stored at all times, requiring additional memory
resources. There is also overhead involved in
focusing  on  a  promising  solution  path  at  a  particular 
point  in  problem  solving,  both  from  the  local  and  global 
perspectives,  and  in  managing  the  links  between  solu¬ 
tion  components  along  each  path.  The  number  of  open 
solution  paths  is  highly  dependent  on  the  domain,  the 
number  of  agents,  and  the  control  policies  of  individual 
agents.  This  number  can  be  controlled  through  param¬ 
eter  settings  in  TEAM  and  through  the  specification  of 
which  negotiated-search  operators  will  be  active  for  each 
agent  in  the  agent  set. 

3.2  A  State-Based  View  of  Negotiated  Search 

Figure  1  provides  a  state-based  view  of  the  transition 
of  a  composite  (shared)  solution  from  its  initial  state  (a 
problem  specification)  to  a  termination  state  (an  infea¬ 
sible  solution,  an  unacceptable  solution,  or  a  complete 
acceptable  solution).  In  this  figure,  states  are  defined 
in terms of three attributes of composite solutions:
acceptability, completeness, and search-state. The possible
values for acceptability are acceptable, unacceptable,
and infeasible. Possible values for completeness are
complete and incomplete. Note that complete means that all
agents  have  had  the  opportunity  to  extend  or  critique 
the  solution.  A  solution  with  all  required  components 
can  still  be  waiting  for  critiques  from  other  agents  and  is 
not  considered  complete  in  that  case.  Search-state  can 
take  the  values  initial  or  closed. 
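
One way to picture these three attributes is as a small record type. The sketch below is illustrative only: the class names and the `is_terminal` rule are assumptions that encode our reading of the termination states named in the text (infeasible, unacceptable once search is closed, and complete acceptable), not code taken from TEAM.

```python
from dataclasses import dataclass
from enum import Enum


class Acceptability(Enum):
    ACCEPTABLE = "acceptable"
    UNACCEPTABLE = "unacceptable"
    INFEASIBLE = "infeasible"


class SearchState(Enum):
    INITIAL = "initial"
    CLOSED = "closed"


@dataclass(frozen=True)
class CompositeState:
    acceptability: Acceptability
    complete: bool  # True only once every agent has extended or critiqued
    search_state: SearchState = SearchState.INITIAL


def is_terminal(state: CompositeState) -> bool:
    """Assumed termination rule; see the hedges in the lead-in."""
    if state.acceptability is Acceptability.INFEASIBLE:
        return True  # pruned path: no further work is done on it
    if state.acceptability is Acceptability.ACCEPTABLE:
        return state.complete  # a complete acceptable solution
    # Unacceptable solutions remain potential compromises until closure.
    return state.search_state is SearchState.CLOSED


waiting = CompositeState(Acceptability.UNACCEPTABLE, complete=True)
print(is_terminal(waiting))
```

Keeping completeness separate from acceptability mirrors the point made above: a solution with all components present is still incomplete while critiques are outstanding.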

Figure 1: A State-Based View of Negotiated Search

A negotiated-search operator is a search function applied
by an agent. Each operator has a generic form that
is expressed in an agent language defined by TEAM,
specifying its inputs, outputs, and functionality. The decision
to apply a particular operator to a problem-solving
situation is made by an agent within its local view of
the problem-solving situation. The arcs in Figure 1 are
negotiated-search operators that can be applied by some
agent to a solution.

Each agent instantiates one or more of the negotiated-search
operators: initiate-solution, extend-solution,
critique-solution, and relax-solution-requirement. In
addition, TEAM instantiates the terminate-search operator.
These operators will be described in detail below, but
we provide an overview here to provide a sense of their
functionality. Initiate-solution is applied by an agent to
generate a base proposal that will be used as the basis
for a new composite solution. Extend-solution is applied
by an agent to: 1) add a component proposal to a
composite solution; 2) evaluate the composite solution from
a local perspective; and 3) provide feedback information
if conflicts are detected. Critique-solution is applied to:
1) evaluate a composite solution (without generating a
component proposal); and 2) provide feedback information
if conflicts are detected. Relax-solution-requirement
is applied to: 1) select a local requirement to relax; 2)
update the local database to effect the relaxation; and
3) reevaluate existing solutions in light of the relaxation.
Terminate-search is applied by TEAM to change the state
of the problem solving from initial to closed, thereby
changing the termination status of solutions.

The negotiated-search algorithm is applied by a set of
agents, A. Let A = {A1, A2, A3} and assume that A1
initiates a solution, A2 extends the solution, and A3
critiques some aspect of that solution. We examine a typical
search in which a conflict occurs. A1 first applies the
operator initiate-solution to a problem specification and
produces a partial acceptable solution (arc 1). Then A2
applies extend-solution without detecting a conflict.
Although the solution now has all components specified, it
is not complete until all critiques have also been received.
Therefore the solution is now partial and acceptable (arc
3). A3 next applies critique-solution, detects a conflict,
and evaluates the solution as unacceptable (arc 8). This
solution remains as it is for some amount of time while
the agents are working on other solution paths. When
further search fails to produce an acceptable solution,
A3 decides to relax the requirement that made this
solution unacceptable. The solution is now acceptable to A3
and, since it was already complete, it reaches the
termination state of complete acceptable solution (arc 15). In
this way, various paths through the state diagram can
be achieved by the agent set.

Although  the  above  example  describes  a  sequential  or¬ 
dering  of  operator  applications,  TEAM  permits  concur¬ 
rency  except  where  there  are  domain-dependent  opera¬ 
tor  preconditions  that  force  sequential  execution.  Con¬ 
currency  requires  that  TEAM  have  mechanisms  for  han¬ 
dling  conflicts  that  occur  due  to  the  simultaneous  de¬ 
velopment  of  extending  proposals  and  criticisms.  These 
mechanisms  are  discussed  in  [Lander,  1993]. 

4  Negotiated  Search  Operators 

In  this  section,  we  present  a  detailed  description  of  the 
negotiated-search  operators.  Notice  that  the  operators 
depicted  in  Figure  1  work  at  the  surface  level  of  problem 
solving:  they  move  a  particular  solution  through  various 
states  to  a  termination  state.  They  do  not  address  the 
issue  of  feedback  and  its  effect  on  problem  solving.  Later 
in  this  section,  we  will  present  two  operators  that  an 
agent  applies  to  assimilate  conflict  information  into  its 
knowledge  base,  thereby  refining  its  view  of  the  search 
space. 


4.1  Initiate-Solution

Initiate-solution is the basic operator for initiating
solutions. It is applied within the agent’s view of solution
requirements: local requirements, those imposed by the
problem specification, and any known external
requirements learned from other agents. Given these
requirements, it creates the base proposal. Initiate-solution is
executed by one or more agents at system start-up time,
and may be repeatedly executed as earlier proposed
solutions are rejected by other agents or if alternative
solutions are desired. If earlier solutions have been proposed
and rejected, the initiating agent may have received
conflict information that will influence the generation of new
base proposals.

At least one agent must instantiate initiate-solution;
however,  instantiating  it  at  multiple  agents  is  likely  to 
result  in  a  more  diverse  set  of  solution  paths  and  more 
thorough  coverage  of  the  composite  solution  space.  De¬ 
pending  on  characteristics  of  the  agents  and  agent  set, 
it  may  also  have  a  distracting  effect.  Trade-offs  between 
coverage  and  distraction  are  a  ubiquitous  problem  in  dis¬ 
tributed  systems  and  are  discussed  generally  in  [Lesser 
and  Erman,  1980]  and  specifically  with  respect  to  nego¬ 
tiated  search  in  [Lander  and  Lesser,  1992b]. 

When  no  base  proposal  can  be  found  under  the  exist¬ 
ing  set  of  requirements,  an  agent  can  relax  requirements 


to  expand  the  search  space.  If  there  are  requirements 
on  solutions  that  come  from  information  communicated 
by  another  agent  (external  requirements),  the  initiating 
agent  can  ignore  one  or  more  of  these  requirements  in 
its  own  search.  Notice  that  the  other  agent  does  not  ac¬ 
tually  relax  the  requirements.  In  this  way,  each  agent 
chooses  the  set  of  requirements,  both  internal  and  ex¬ 
ternal,  it  will  attempt  to  satisfy.  When  known  exter¬ 
nal  requirements  are  violated,  the  proposal  is  suggested 
as  a  possible  compromise  rather  than  a  fully  acceptable 
solution.  The  external  agent  that  has  its  requirements 
violated  in  the  compromise  proposal  cannot  be  forced 
to  accept  it.  Because  the  selection  of  a  mutually  ac¬ 
ceptable  solution  is  democratic,  each  agent  votes  on  the 
acceptability  of  a  solution.  The  external  agent  that  has 
the violated requirement(s) can initially vote that the
solution  is  unacceptable  but,  if  it  does  not  find  a  bet¬ 
ter  alternative,  it  may  eventually  agree  to  accept  this 
compromise. 

If  there  are  no  relaxable  external  solution  requirements 
or  if  the  external  requirements  are  inflexible,  an  agent 
can  relax  some  local  requirement.  If  no  base  proposal  can 
be  found  at  any  level  of  external  or  internal  requirement 
relaxation,  the  agent  returns  a  failure  along  with  any 
conflict  information  it  can  generate  that  describes  why 
it  failed.  TEAM  returns  a  failure  if  no  agent  can  generate 
a  new  base  proposal  and  all  previously  created  solutions 
have been found to be infeasible.

4.2  Critique-Solution  and  Extend-Solution

The  critique-solution  operator  is  applied  by  an  agent  to 
evaluate  a  partially  or  fully  specified  composite  solution. 
The  extend-solution  operator  is  applied  by  an  agent  to 
extend  and  evaluate  a  partially  specified  composite  so¬ 
lution.  These  two  operators  will  be  described  jointly  be¬ 
cause  of  their  similarity.  The  input  for  these  operators  is 
a  composite  solution  that  was  initiated  by  another  agent. 
The output for critique-solution is an evaluation and,
when a conflict is detected, conflict information. The
output  for  extend-solution  is  a  proposal,  an  evaluation, 
and,  when  a  conflict  exists,  conflict  information. 

The  extend-solution  operator  is  required  in  domains 
where  solutions  comprise  interacting  components  and 
each  component  is  developed  by  an  expert  agent.  The 
component  that  an  agent  develops  with  extend-solution 
must  be  compatible  with  the  solution  being  extended  (it 
must  have  the  same  values  for  solution  variables  that 
overlap).  The  agent  executing  the  operator  searches  for 
a  compatible  proposal  under  its  known  solution  require¬ 
ments  and  the  requirements  imposed  by  the  assigned  pa¬ 
rameter  values  of  the  solution  to  be  extended. 

Although  we  will  not  discuss  critique-solution  fur¬ 
ther,  the  following  discussion  of  extend-solution  gen¬ 
erally  applies  to  both  operators,  except  that  critique- 
solution  evaluates  the  existing  composite  solution  rather 
than  creating  and  evaluating  a  compatible  proposal.  In 
extend-solution, if a compatible proposal is found that
does  not  violate  any  local  solution  requirements,  it  is  re¬ 
turned  as  an  acceptable  proposal.  If  the  best  compatible 
proposal  found  violates  some  relaxable  (soft)  local  solu¬ 
tion  requirements  (where  the  best  proposal  is  one  that 


maximizes  local  evaluation),  it  is  returned  as  unaccept¬ 
able  along  with  information  that  describes  the  conflict. 
Although  currently  unacceptable,  future  requirement  re¬ 
laxations  may  change  its  status  and,  therefore,  the  so¬ 
lution  is  saved  as  a  potential  compromise.  In  the  final 
case,  no  compatible  proposal  can  be  found  without  vio¬ 
lating  nonrelaxable  (hard)  requirements  of  the  executing 
agent.  In  this  case,  the  agent  fails  and  the  solution  path 
is  marked  as  infeasible.  Conflict  information  is  returned 
whenever  possible  that  describes  why  the  path  is  infea¬ 
sible,  i.e.,  what  hard  requirements  were  violated. 
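
The three outcomes of extend-solution can be sketched as below. This is a hedged illustration under several assumptions: proposals are attribute dictionaries, requirements are named predicates flagged hard or soft, and the names `Requirement`, `ExtendResult`, and `extend_solution` are invented for the example rather than taken from TEAM's agent language.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Proposal = dict[str, float]


@dataclass
class Requirement:
    name: str
    hard: bool  # hard = nonrelaxable, soft = relaxable
    holds: Callable[[Proposal], bool]


@dataclass
class ExtendResult:
    proposal: Optional[Proposal]
    evaluation: str  # "acceptable", "unacceptable", or "infeasible"
    conflicts: list[str]  # names of violated requirements, when known


def extend_solution(composite, candidates, requirements, quality):
    # Compatible candidates agree with the composite on overlapping variables.
    compatible = [
        c for c in candidates
        if all(c[k] == composite[k] for k in c.keys() & composite.keys())
    ]
    hard = [r for r in requirements if r.hard]
    soft = [r for r in requirements if not r.hard]
    feasible = [c for c in compatible if all(r.holds(c) for r in hard)]
    if not feasible:
        # Final case: every compatible proposal breaks a hard requirement.
        broken = sorted({r.name for r in hard for c in compatible if not r.holds(c)})
        return ExtendResult(None, "infeasible", broken)
    clean = [c for c in feasible if all(r.holds(c) for r in soft)]
    if clean:
        return ExtendResult(max(clean, key=quality), "acceptable", [])
    # Best proposal maximizes local evaluation but violates soft requirements.
    best = max(feasible, key=quality)
    return ExtendResult(best, "unacceptable", [r.name for r in soft if not r.holds(best)])


composite = {"speed": 1800.0}
candidates = [
    {"speed": 1800.0, "cost": 120.0},
    {"speed": 1800.0, "cost": 90.0},
    {"speed": 3600.0, "cost": 50.0},
]
requirements = [
    Requirement("cost<=150", True, lambda p: p["cost"] <= 150.0),
    Requirement("cost<=80", False, lambda p: p["cost"] <= 80.0),
]
result = extend_solution(composite, candidates, requirements, lambda p: -p["cost"])
print(result.evaluation, result.conflicts)
```

In this made-up run the cheapest compatible candidate still breaks the soft cost limit, so it would be saved as a potential compromise rather than pruned.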

4.3  Relax-Solution-Requirement

Relaxation  of  solution  requirements  is  a  necessary  part  of 
negotiated  search.  In  order  to  terminate  problem  solv¬ 
ing,  agents  must  reach  mutual  acceptability  on  one  or 
more  solutions.  Acceptability  is  defined  as  an  attribute 
of  a  composite  solution  as  shown  in  Figure  1.  If  any 
agent  locally  evaluates  a  solution  as  unacceptable,  the 
solution  is  considered  globally  unacceptable.  However, 
as  can  be  seen  in  that  figure,  a  solution  that  is  unac¬ 
ceptable  at  some  point  in  time  can  later  become  accept¬ 
able  when  the  agent  or  agents  that  reject  it  relax  their 
solution  requirements. 

There are three primary forms of relaxation: unilateral
relaxation, feedback-based relaxation, and problem-state
relaxation. Unilateral relaxation occurs when an
agent  decides  to  relax  a  requirement  due  to  its  inability 
to  find  a  solution  under  the  problem  specification,  i.e., 
the  agent  finds  that,  given  the  problem  specification  and 
its  initial  solution  requirements,  it  cannot  produce  a  lo¬ 
cally  acceptable  proposal.  This  situation  occurs  in  the 
application  of  the  initiate-solution  operator  as  described 
in  Section  4.1. 

Feedback-based relaxation occurs when an agent relaxes
a solution requirement because of some explicit
information about the requirements of some other
agent(s), i.e., a conflict is found between relaxable local
solution requirements and less flexible external solution
requirements. This occurs when external information
has been received by an agent and is being assimilated
as described in Section 4.5.

Problem-state  relaxation  is  a  reaction  to  the  lack  of 
overall  problem-solving  progress.  In  the  current  TEAM 
framework,  problem-state  relaxation  occurs  at  specific 
processing-cycle  intervals:  for  example,  all  agents  may 
relax  a  solution  requirement  after  10  processing  cycles. 
Alternatively,  the  user  can  specify  the  relaxation  param¬ 
eter  separately  for  each  agent,  so  that  one  agent  may 
relax  after  10  processing  cycles  while  another  will  relax 
after  20  processing  cycles.  Problem-state  relaxation  oc¬ 
curs  because  the  problem  may  be  overconstrained  by  the 
full  agent  set.  The  ability  to  formulate,  communicate, 
and  assimilate  constraining  information  is  not  guaran¬ 
teed  to  be  complete  and  precise  across  the  agent  set  and 
the  reality  is  that  agents  can’t  always  determine  whether 
the  composite  search  space  is  overconstrained.  There¬ 
fore,  they  must  have  some  heuristic  method  (as  well  as 
the  deterministic  methods  above)  for  deciding  when  it  is 


appropriate to relax requirements.^ Because of
problem-state relaxation, we can guarantee that if any initial
proposal is generated that can result in a feasible solution,
either that solution will eventually become acceptable to
all agents, or some other solution will become acceptable
to all agents, and deadlock will not occur.
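
A toy simulation can make the cycle-based heuristic concrete. Everything in it is an assumption made for illustration: each agent holds a single soft cost bound, relaxes it by a fixed step every `relax_every` cycles, and the names `Agent` and `cycles_until_agreement` are hypothetical rather than part of TEAM.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    name: str
    max_cost: float  # soft requirement: accept only if cost <= max_cost
    relax_every: int  # per-agent relaxation parameter, in processing cycles
    relax_step: float  # how far one relaxation loosens the bound

    def accepts(self, cost: float) -> bool:
        return cost <= self.max_cost

    def maybe_relax(self, cycle: int) -> None:
        # Problem-state relaxation: triggered by elapsed cycles alone.
        if cycle > 0 and cycle % self.relax_every == 0:
            self.max_cost += self.relax_step


def cycles_until_agreement(agents, cost: float, max_cycles: int = 1000) -> Optional[int]:
    """Return the first cycle at which every agent accepts the solution."""
    for cycle in range(max_cycles):
        for agent in agents:
            agent.maybe_relax(cycle)
        if all(agent.accepts(cost) for agent in agents):
            return cycle
    return None


agents = [
    Agent("A1", max_cost=80.0, relax_every=10, relax_step=10.0),
    Agent("A2", max_cost=95.0, relax_every=20, relax_step=5.0),
]
cycles = cycles_until_agreement(agents, cost=100.0)
print(cycles)
```

Because each soft bound keeps loosening, a feasible solution eventually becomes acceptable to every agent, which is the intuition behind the no-deadlock claim above.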

4.4  Terminate-Search 

The operator terminate-search is applied by TEAM, rather
than by an agent, to change the search phase of the
algorithm from initial to closed when some (user-specified)
number of acceptable proposals has been found.^ As seen
in Figure 1, when this change occurs, partial and
complete unacceptable solutions move from intermediate to
termination states. Any partial acceptable solutions are
completed, however, to ensure that good partial solutions
are not abandoned.
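
The effect of terminate-search on the current solution set might look like the sketch below. It assumes that the user-specified count refers to complete acceptable solutions, which the text does not state explicitly; the `Solution` record and function name are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Solution:
    label: str
    acceptable: bool
    complete: bool
    terminated: bool = False


def terminate_search(solutions, required: int):
    """Close the search if enough complete acceptable solutions exist.

    Returns the new search phase and the labels of partial acceptable
    solutions that should still be completed.
    """
    found = [s for s in solutions if s.acceptable and s.complete]
    if len(found) < required:
        return "initial", []
    still_open = []
    for s in solutions:
        if s.acceptable and not s.complete:
            still_open.append(s.label)  # good partial work is not abandoned
        else:
            s.terminated = True  # moves to a termination state
    return "closed", still_open


solutions = [
    Solution("s1", acceptable=True, complete=True),
    Solution("s2", acceptable=False, complete=True),
    Solution("s3", acceptable=True, complete=False),
    Solution("s4", acceptable=False, complete=False),
]
phase, pending = terminate_search(solutions, required=1)
print(phase, pending)
```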

4.5  Assimilating  Information 

There  are  two  operators  associated  with  assimilating  in¬ 
formation  at  an  agent:  store-received-information  and 
retrieve-information.  Store-received-information  takes 
conflict  information  from  other  agents,  syntactically 
checks  to  see  if  the  information  already  exists  in  the  lo¬ 
cal  knowledge  base  and,  if  not,  stores  it  so  that  it  can 
be  retrieved.  A  received  requirement  may  be  indexed 
by  various  attributes  including  the  name  of  the  sending 
agent,  the  flexibility  of  the  requirement,  the  names  and 
acceptable  values  of  constrained  solution  attributes,  and, 
in  the  case  of  ordered  solution  attributes,  whether  the  re¬ 
quirement  defines  a  minimum  or  maximum  boundary  on 
potential  values,  e.g.,  x  >  b. 
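
A minimal sketch of store-received-information follows, assuming received requirements are simple records on ordered attributes. The field names, the integer flexibility scale, and the `KnowledgeBase` class are illustrative assumptions; the duplicate check is purely syntactic (field-by-field equality), as in the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReceivedRequirement:
    sender: str  # name of the sending agent
    attribute: str  # constrained solution attribute
    bound: str  # "min" or "max" boundary on an ordered attribute
    value: float
    flexibility: int  # assumed scale: larger means easier to relax


class KnowledgeBase:
    def __init__(self) -> None:
        self._seen: set[ReceivedRequirement] = set()
        self._by_attribute: dict[str, list[ReceivedRequirement]] = {}

    def store_received_information(self, req: ReceivedRequirement) -> bool:
        """Store req unless an identical record exists; report if stored."""
        if req in self._seen:
            return False
        self._seen.add(req)
        self._by_attribute.setdefault(req.attribute, []).append(req)
        return True

    def lookup(self, attribute: str, bound: Optional[str] = None):
        found = self._by_attribute.get(attribute, [])
        return [r for r in found if bound is None or r.bound == bound]


kb = KnowledgeBase()
req = ReceivedRequirement("A2", "x", "min", 5.0, flexibility=1)
print(kb.store_received_information(req))
print(kb.store_received_information(ReceivedRequirement("A2", "x", "min", 5.0, 1)))
```

Indexing by attribute and boundary type keeps later retrieval cheap, since a local search typically asks only for the constraints on the attributes it is about to assign.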

Retrieve-information is an operator that extends or
replaces  an  agent’s  default  capability  to  retrieve  rele¬ 
vant  constraining  information  from  its  knowledge  base. 
Because  an  agent’s  internal  knowledge  is  expected  to  be 
locally  consistent,  the  default  retrieval  mechanism  gen¬ 
erally  does  not  handle  cases  where  conflicts  may  exist 
in  the  retrieved  requirements.  Requirement  retrieval 
occurs during solution initiation, extension, and
criticism. The goal of the retrieval process is to find the
most  restrictive,  but  non-conflicting,  set  of  solution  re¬ 
quirements  that  constrain  a  solution  for  the  current  local 
search  problem.  Different  types  of  requirements  require 
different  treatment,  but  to  provide  a  concrete  example 
of  retrieval,  we  present  the  algorithm  used  for  selecting 
boundary  constraints  on  numerical  solution  attributes  in 
our  application  systems.  Potentially  relevant  constraints 
are  retrieved  and  sorted  into  maximum  and  minimum 
boundary  groups.  The  most  restrictive  maximum  con¬ 
straint  (MAX)  and  the  most  restrictive  minimum  con¬ 
straint  (MIN)  from  each  group  are  selected  (where  most 
restrictive  means  the  highest  value  from  the  MIN  group 
and  the  lowest  value  from  the  MAX  group).  Then  the 

^Using the number of processing cycles as a heuristic is
a simplistic approach. More sophisticated mechanisms for
applying problem-state relaxation based on characteristics of
the problem-solving situation, rather than on time, are
discussed in [Lander, 1993].

^This is a simplified version of the TEAM termination policy
that  integrates  agent  acceptability  and,  optionally,  a  domain- 
dependent  global  evaluation  of  solutions. 


algorithm  loops  through  the  following  sequence  until  a 
set  of  minimum  and  maximum  values  is  found  or  until 
it  is  determined  that  no  non-conflicting  set  exists. 

LOOP:

•  If the value of MAX is greater than or equal to the
value of MIN, return MAX and MIN since a non-conflicting
set has been found.

•  Otherwise, if the flexibility of MAX is greater than the
flexibility of MIN, select the next most restrictive
maximum constraint (MAX) and go to LOOP.

•  Otherwise, if the flexibility of MAX is less than the
flexibility of MIN, select the next most restrictive
minimum constraint (MIN) and go to LOOP.

•  Otherwise, the flexibility of MAX is equal to the
flexibility of MIN. Then: if MAX is locally owned, select the
next most restrictive minimum constraint (MIN) and go
to LOOP. If MAX is not locally owned and MIN is locally
owned, select the next most restrictive maximum
constraint (MAX) and go to LOOP. If neither MAX nor
MIN is locally owned, select the next most restrictive
minimum constraint (MIN) and go to LOOP.
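
The loop above can be written out as a short function. This is a sketch under stated assumptions: the `Bound` record, the integer flexibility scale (larger means more flexible), and the decision to return None as soon as either group is exhausted are ours; the text only says the loop ends when no non-conflicting set exists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Bound:
    value: float
    flexibility: int  # assumed scale: larger means more flexible
    local: bool  # True if owned by the retrieving agent


def select_bounds(
    minimums: list[Bound], maximums: list[Bound]
) -> Optional[tuple[Bound, Bound]]:
    """Return the most restrictive non-conflicting (MIN, MAX) pair, or None."""
    mins = sorted(minimums, key=lambda b: b.value, reverse=True)  # highest first
    maxs = sorted(maximums, key=lambda b: b.value)  # lowest first
    i = j = 0  # i walks the MIN group, j walks the MAX group
    while i < len(mins) and j < len(maxs):
        lo, hi = mins[i], maxs[j]
        if hi.value >= lo.value:
            return lo, hi  # non-conflicting set found
        if hi.flexibility > lo.flexibility:
            j += 1  # MAX is more flexible: try the next maximum
        elif hi.flexibility < lo.flexibility:
            i += 1  # MIN is more flexible: try the next minimum
        elif hi.local:
            i += 1  # tie, MAX is local: keep it, advance MIN
        elif lo.local:
            j += 1  # tie, only MIN is local: keep it, advance MAX
        else:
            i += 1  # tie, neither is local: advance MIN
    return None  # a group ran out: no non-conflicting set exists


mins = [Bound(10.0, 1, True), Bound(6.0, 3, False)]
maxs = [Bound(8.0, 2, False), Bound(12.0, 1, False)]
print(select_bounds(mins, maxs))
```

In the example, the maximum of 8 conflicts with the local minimum of 10 and is the more flexible of the two, so it is skipped and the pair (10, 12) is returned.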

In  reusable  agent  sets,  operator  diversity  is  expected — 
not  every  agent  will  instantiate  every  operator  including 
the  store-received-information  and  retrieve-information 
operators.  Because  of  this,  when  an  agent  formulates 
and  sends  conflict  information  to  another  agent,  there 
is  no  guarantee  that  the  receiving  agent  will  use  that 
information  appropriately.  Therefore,  although  conflict 
information  is  shared  willingly  and  cooperatively  in  ne¬ 
gotiated  search,  agents  do  not  depend  on  other  agents 
to  react  in  a  fixed  way  to  that  information. 

4.6  Agent-Level  Control  of  Operator 
Application 

Figure  1  describes  domain-independent  state  precondi¬ 
tions  that  must  be  satisfied  before  an  agent  can  apply 
one  of  its  operators  to  a  particular  solution.  However, 
because  there  are  multiple  solution  paths,  and  because 
some operators are not directly involved in solution
generation (e.g., store-received-information), an agent may
have  multiple  operators  ready  to  execute  at  any  given 
time.  The  order  in  which  an  agent  schedules  local  opera¬ 
tors  is  not  mandated  by  either  TEAM  or  by  the  negotiated- 
search  algorithm.  However,  because  an  agent’s  percep¬ 
tion  of  the  world  changes  over  time,  the  order  in  which 
particular  operators  are  executed  does  affect  system  per¬ 
formance  and  the  effect  of  local  scheduling  on  the  over¬ 
all  behavior  of  the  system  should  be  considered.  Some 
general  policies  for  local  scheduling  are  useful  in  most 
situations, e.g., agents should assimilate any new
information received before initiating or critiquing solutions.
The degree of sophistication required in local scheduling,
though, is highly dependent on the application and the
complexity of required interactions.
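
One simple local scheduling policy is a fixed-priority queue, sketched below. Only the rule that assimilation precedes initiating or critiquing comes from the text; the rest of the priority table, and the `LocalScheduler` class itself, are assumptions chosen for illustration.

```python
import heapq
from itertools import count

# Lower number runs first. Only "assimilate before initiating or
# critiquing" is taken from the text; the other ranks are assumed.
PRIORITY = {
    "store-received-information": 0,
    "relax-solution-requirement": 1,
    "extend-solution": 2,
    "critique-solution": 2,
    "initiate-solution": 3,
}


class LocalScheduler:
    def __init__(self) -> None:
        self._queue: list = []
        self._arrival = count()  # tie-breaker: first come, first served

    def add(self, operator: str, target=None) -> None:
        entry = (PRIORITY[operator], next(self._arrival), operator, target)
        heapq.heappush(self._queue, entry)

    def next_operator(self):
        """Pop the ready operator with the best priority, or None if idle."""
        if not self._queue:
            return None
        _, _, operator, target = heapq.heappop(self._queue)
        return operator, target


scheduler = LocalScheduler()
scheduler.add("initiate-solution")
scheduler.add("critique-solution", "S1")
scheduler.add("store-received-information", "conflict from A2")
print(scheduler.next_operator())
```

An application with richer interactions would likely replace the static table with a rating computed from the current problem-solving state.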

5  Conclusions 

Negotiated search is a flexible and widely applicable
distributed-search algorithm. It specifically addresses
issues that arise in multi-agent systems composed of
reusable and heterogeneous agents. The algorithm
acknowledges the inevitability of conflict among the agents,
and exploits that conflict to drive agent interaction and
guide local search.


Negotiated search has been implemented in TEAM, a
generic framework for the integration of reusable agents,
and consequently, in two application systems built on top
of TEAM: STEAM (a seven-agent system for the mechanical
design of steam condensers) and AGREE (a two-agent
system for buy/sell contract negotiation). Testing and
analysis of the algorithm within the context of the application
systems are described in other work [Lander, 1993,
Lander and Lesser, 1992a, Lander and Lesser, 1992b].
Results from experiments conducted with negotiated
search show that the algorithm can produce high-quality
solutions. They also support the claim that the algorithm
is flexible enough to work in reusable-agent
systems where the search operators are randomly distributed
across the agent set. We see negotiated search
as a default algorithm: one that will provide reasonable
solutions in a reasonable amount of time without
problem-specific customization. As a complementary
approach to developing this general algorithm, we are
developing customized algorithms that require specific
agent characteristics or inter-agent relationships to exist.
By taking advantage of these characteristics, it is often
possible to improve solution quality and/or processing-time
performance. TEAM supports the dynamic selection
of a search algorithm, thereby enabling an agent set to
switch to a customized algorithm if the requirements for
application of the algorithm are met. This work is
described in [Lander, 1993].

Abstract

In this paper, we study the problem of achieving efficient interaction in a
distributed scheduling system whose scheduling agents may borrow resources from
one another. Specifically, we expand on Sycara’s use of resource texture measures in
a distributed scheduling system with a central resource monitor for each resource
type and apply it to the decentralized case. We show how analysis of the
abstracted resource requirements of remote agents can guide an agent’s choice of local
scheduling activities not only in determining local constraint tightness, but also
in identifying activities that reduce global uncertainty. We also exploit meta-level
information to allow the scheduling agents to make reasoned decisions about when
to attempt to solve impasses locally through backtracking and constraint relaxation
and when to request resources from remote agents. Finally, we describe the current
state of negotiation in our system and discuss plans for integrating a more
sophisticated cost model into the negotiation protocol. This work is presented in the
context of the Distributed Airport Resource Management System, a multi-agent
system for solving airport ground service scheduling problems.

*This work was partly supported by DARPA contract N00014-92-J-1698 and NSF contracts
CDA-8922572 and IRJ-9208920. The content of this paper does not necessarily reflect the position or the
policy of the Government and no official endorsement should be inferred.

1 Introduction


The problem of scheduling resources and activities is known to be extremely challenging
[8, 7, 14, 11]. The complexity increases when the scheduling process becomes dependent
upon the activities of other concurrent schedulers. Such interactions between scheduling
agents arise when, for example, agents must borrow resources from other agents in
order to resolve local impasses or improve the quality of a local solution. Distributed
scheduling applications are not uncommon: the classic meeting planning
problem [13] can be considered a distributed scheduling problem; the airport ground
service scheduling (AGSS) problem we address in this paper is another; and similar
problems may arise in factory floor manufacturing domains.

In  distributed  scheduling  systems,  problem-solving  costs  will  likely  increase  because 
of  the  interaction  among  agents  caused  by  the  lending  of  resources.  One  method  of 
increasing the quality of solutions developed by such multi-agent schedulers and
minimizing the costs of backtracking is to allow agents to communicate abstracted versions
of their resource requirements and capabilities to other agents. The use of this meta-level
information  allows  the  scheduling  agents  to  develop  models  of  potential  interactions 
between  their  scheduling  processes  and  those  of  other  agents,  where  an  interaction  is 
defined  as  a  time  window  in  which  the  borrowing  or  lending  of  a  resource  might  occur. 
We  show  how  the  identification  of  interactions  affects  the  choice  of  scheduling  heuristics, 
communication,  and  negotiation  policies  in  a  distributed  scheduling  system.  We  discuss 
our  heuristics  in  the  context  of  a  specific  testbed  application,  the  Distributed  Airport 
Resource  Management  System  (DiS-ARM). 

2  Related  Work 

The use of meta-level information to define the interactions between agents has been
studied extensively by Durfee and Lesser via the use of partial global plans [5]. This
work has been extended by Decker and Lesser [3, 4] to incorporate more sophisticated
coordination relationships. According to this framework, we can view our detection of
potential loan requests using texture measures as an identification of facilitating
relationships, and our modification of the scheduling algorithm as an attempt to exploit this
perceived relationship. The formulation of distributed constraint satisfaction problems
as distributed AI was described by Yokoo [16]; however, this work concentrated more on
the problem of distributed backtracking than on coordinating agents.

The problem of coordinating distributed schedulers has been studied extensively by
Sycara and colleagues [15]. They describe a mechanism for transmitting abstractions of
resource requirements (textures) between agents. Each agent uses these texture measures
to form a model of the aggregate system demand for resources. This model is used
to allocate resources using various heuristics. For example, a least-constraining-value
heuristic is used to allocate resources based on the minimization of the probability that
the reservation would conflict with any other. For each type of resource, one agent is
assigned the task of coordinating allocations and determining whether requests can be
satisfied. All resources of a given type are considered interchangeable and the centralized
resource monitor does not need to perform significant planning to choose the most
suitable resource; instead, its role is simply to ensure that each resource is allocated to no
more agents than can be served by that resource during any given time period.

We  investigate  a  similar  use  of  abstracted  resource  demands  for  a  case  in  which 
centralized  resource  monitors  are  not  possible  since  resources  of  the  same  type  may 
possess  unique  characteristics,  and  agents  possess  proprietary  information  about  local 
resources (such as current location and readiness). Agents may respond to a request
for a resource by immediately satisfying it with a reservation, by denying it, or by
performing local problem-solving actions to attempt to produce a suitable reservation.

In our domain, we have found that Sycara’s texture measures alone do not convey
sufficient information to allow satisfactory scheduling. Their texture measures consist of
a  demand  profile  for  each  resource  which  represents,  for  each  time  interval,  the  sum  of 
probabilities  that  resource  requests  will  overlap  that  interval.  These  probabilities  are  based 
on  the  assumption  that  reservations  can  occur  at  any  time  within  the  requested  interval. 
Assignment  of  resources  is  then  performed  using  these  probabilities  to  implement  a 
least-constraining-value  heuristic. 
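
To make the notion of a demand profile concrete, the following is a minimal sketch, not the formulation of [15]. It assumes discrete unit time slots and that every feasible start time within a request's window is equally likely; the function names and the tuple layout of a request are illustrative assumptions.

```python
def occupancy_probability(est, lft, duration, slot):
    """Probability that a request occupies unit time slot `slot`.

    Assumes every start time in [est, lft - duration] is equally likely
    and that the window is at least as long as the duration.
    """
    starts = range(est, lft - duration + 1)
    covering = sum(1 for s in starts if s <= slot < s + duration)
    return covering / len(starts)

def demand_profile(requests, horizon):
    """For each slot, sum the occupancy probabilities of all requests."""
    return [
        sum(occupancy_probability(est, lft, d, t) for est, lft, d in requests)
        for t in range(horizon)
    ]

# Two requests given as (earliest start, latest finish, duration).
requests = [(0, 4, 2), (1, 4, 3)]
profile = demand_profile(requests, horizon=4)
print(profile)
```

Under these assumptions the profile is symmetric within each window, which is exactly why it cannot express a preference for the early or late end of an interval.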

These  texture  measures  do  not  capture  sufficient  information  regarding  time-shift 
preferences of resource assignments within the specified interval. In our domain,
resources may legally be assigned at any time within the interval between the earliest start
time  and  the  latest  finish  time,  but  for  some  activities,  there  exist  strong  preferences  as 
to  which  end  of  the  interval  the  assignment  is  biased.  For  example,  when  scheduling 
ground  services  for  an  airport,  once  a  flight  arrives,  it  is  important  to  unload  baggage  as 
early  as  possible  so  that  necessary  transfers  can  be  made  to  connecting  flights.  The  shift 
preference  can  be  determined  by  the  assigning  agent  using  domain  knowledge,  provided 
that  it  knows  the  nature  of  the  task  generating  the  request.  Because  this  information  is 
not  captured  in  the  texture  measures,  the  heuristic  described  by  Sycara,  et  al.  is  likely  to 
lead  to  poor  schedules  within  the  airport  ground  service  scheduling  domain. 

3  Overview:  The  Distributed  Dynamic  Scheduling  System 

In order to test our approach to solving distributed resource-constrained scheduling
problems (RCSPs), we have designed a distributed version of a reactive, knowledge-based
scheduling system called DSS (the Dynamic Scheduling System) [9]. DSS provides a
foundation for representing a wide variety of real-world RCSPs. Its flexible scheduling
approach is capable of reactively producing quality schedules within dynamic
environments that exhibit unpredictable resource and order behavior. Additionally, DSS is
equipped to manage the scheduling of shared tasks connecting otherwise separate orders,
and to handle RCSPs that involve mobile resources with significant travel requirements.

DSS  is  implemented  as  an  agenda-based  blackboard  system  [6,  1]  using  GBB  (the 
Generic  Blackboard  System)  [2].  It  maintains  a  blackboard  structure  upon  which  a 
developing  schedule  is  constructed,  and  where  the  sets  of  orders  and  resources  for  a 
particular RCSP are stored. A group of knowledge sources is provided for securing the
necessary resource reservations. These knowledge sources are triggered as the result of
developments  on  the  blackboard,  namely  the  creation  and  modification  of  the  service 
goals  attached  to  all  resource-requiring  tasks.  Triggered  knowledge  sources  are  placed 
onto  an  agenda  and  executed  in  the  order  of  their  priority. 
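
The agenda mechanism described above can be sketched as a priority queue of triggered knowledge sources. This is a toy illustration and not GBB or DSS code; the class name, the `assign_resource` knowledge source, and the goal labels are hypothetical.

```python
import heapq
import itertools

class Agenda:
    """Priority agenda: the highest-rated triggered knowledge source runs first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def trigger(self, priority, knowledge_source, goal):
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(
            self._heap, (-priority, next(self._counter), knowledge_source, goal))

    def run(self):
        results = []
        while self._heap:
            _, _, knowledge_source, goal = heapq.heappop(self._heap)
            results.append(knowledge_source(goal))
        return results

def assign_resource(goal):  # hypothetical knowledge source
    return f"assigned:{goal}"

agenda = Agenda()
agenda.trigger(2, assign_resource, "fuel-flight-12")
agenda.trigger(9, assign_resource, "gate-flight-7")
order = agenda.run()
print(order)
```

The counter in each heap entry ensures that two entries with equal priority are never compared by their function objects.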

The Distributed Dynamic Scheduling System (DiS-DSS) maintains separate blackboard
structures for each agent and provides communication utilities for transmitting
requests and meta-level information between agents. Remote analogues of service goals,
task structures, and other scheduling entities are created as needed to model the state
of other agents. The information about other agents’ schedules and commitments is
incomplete and is limited to the content of goals, meta-level information, and those
parts of the schedule to which the local agent itself has contributed.

The approach we have taken towards distributing DSS is to view each agent as
representing an autonomous organization possessing its own resources. It is this autonomous
nature of the organizations that is the rationale for distributing the resource allocation
problem. Although a centralized architecture might produce more efficient solutions,
real-world considerations such as cost and ownership often lead to confederations in
which information transfer regarding commitments and capabilities is limited. In this
model, the primary relationship between agents is a commitment to exchange resources as
needed and a willingness to negotiate with other agents to resolve impasses. This model of
a decentralized group of agents performing independent tasks in a resource-constrained
environment is similar to the architecture of Moehlman’s Distributed Fireboss [10]. We
distinguish our work from Moehlman’s by our use of meta-level information to control
the decision process by which agents choose to resolve impasses locally, through
backtracking and constraint relaxation, or through requests to remote agents.

Because resources are owned by specific agents and possess unique characteristics
regarding location and travel times that are known only to the owning agent, we cannot
define central resource monitors responsible for allocating each type of resource. This,
again, distinguishes our approach from that of Sycara et al. [15]. Agents requiring a
resource must communicate directly with an agent owning a resource of that type and
negotiate for its loan.

This architecture provides a rich domain for the study of agent coordination issues
in a distributed environment; agents must be able to model the interactions of their tasks
with those of neighboring agents closely enough to be able to determine which agents
will be most likely to provide the desired resources at the lowest cost to both agents.
This coordination requires local reasoning on the part of agents in order to determine
how to cooperate efficiently with an acceptable level of communication and redundant
computation.

3.0.1  Assumptions 

In  our  work  with  DiS-DSS,  we  have  made  a  number  of  assumptions  about  the  nature  of 
agents,  schedules,  and  communication  overheads. 

•  Agents  are  cooperative  and  will  lend  a  resource  if  it  is  available. 

• Agents will only request a resource from one agent at a time; this is to avoid
the possibility of redundant computation and communication if multiple agents
attempt to provide the resource (cf. [12]).

•  Once  agents  have  lent  a  resource  to  another  agent,  they  will  never  renege  on  this 
agreement.  This  limits  the  ability  of  the  system  to  perform  global  backtracking; 
we  intend  to  eliminate  this  restriction  in  the  next  version  of  the  system. 

• Communication is asynchronous and can occur at any point during the
construction of a local schedule; therefore, requests may arrive before an agent has
completely determined its own requirements for resources in the time window of
interest.

• The cost of messages lies largely in the processing and in the inherent delay caused
by transmission; the amount of data within a message may be large, within
limits.

3.0.2  Communication  of  Abstract  Resource  Profiles 

Without  information  regarding  other  agents’  abilities  to  supply  missing  resources,  an 
agent  may  be  unable  to  complete  a  solution,  or  may  be  forced  to  compromise  the  quality 
of  its  solution.  To  allow  agents  to  construct  a  model  of  global  system  constraints  and 
capabilities,  we  have  developed  a  protocol  for  the  exchange  and  updating  of  resource 
profiles: summarizations of the agent’s committed resources, available resources, and
estimated future demand.

Upon startup, each agent in DiS-DSS receives a set of orders to be processed.
The agents examine these orders and generate an abstract description of their resource
requirements for the scheduling period. This bottleneck-status-list consists of a list of
intervals, with each interval annotated by a triple: resources in use, resources requested,
and resources available. The request field of this triple represents an abstraction of
the agent’s true resource requirements. Certain aspects of a reservation, such as mobile
resource travel times to the objects to be serviced, cannot be easily estimated in advance.
The time intervals specified for each resource request are pessimistic, consisting of the
earliest possible start time and latest possible finish time for the activity requesting that
resource. The true duration of the task can be estimated by the scheduling agent using
its domain knowledge regarding the typical time required to perform a task. We define
the demand for a resource r performing task T in interval $(t_j, t_k)$ to be:

$$\mathit{avg\_demand}(T, r, t_j, t_k) = \frac{\mathit{duration}(T, r)}{t_k - t_j}$$
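
As a small worked example, the average demand is the estimated task duration divided by the length of the window between the earliest start and the latest finish. The function signature and the baggage-unloading numbers below are illustrative assumptions.

```python
def avg_demand(duration, t_j, t_k):
    """Average demand of one task on one resource over the window (t_j, t_k):
    the estimated task duration divided by the window length."""
    if t_k <= t_j:
        raise ValueError("window must have positive length")
    return duration / (t_k - t_j)

# Hypothetical example: a 20-minute baggage unload that may be placed
# anywhere in a 60-minute window between earliest start and latest finish.
print(avg_demand(duration=20, t_j=0, t_k=60))
```

A value close to 1 means the task fills nearly the whole window, so the request leaves little slack for shifting the reservation.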

Once resource abstractions have been developed for each resource type required (or
possessed) by the agent, it transmits its abstractions to all other agents. Likewise, it
receives abstractions from all agents. Once the agent has received communications from
all other agents, it prepares a map of global resource requirements and uses it to generate
a set of data structures called lending possibilities. Each lending possibility represents an
interval in which some agent appears to have a shortfall in a resource. For each lending
possibility, the agent generates a list of possible lenders for that resource, based on the
global resource map and its knowledge of its own resource requirements. These lending
possibility structures are used to predict when remote agents may request resources and
when the local agent may need to borrow resources. This information guides the agent’s
decision-making process in determining both when to process local goals and when and
from whom to request resources.
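
The derivation of lending possibilities from the exchanged abstractions might look roughly like the sketch below. The data layout, the class and function names, and the rule that a shortfall means "requested exceeds available" are simplifying assumptions, not the DiS-DSS data structures.

```python
from dataclasses import dataclass

@dataclass
class LendingPossibility:
    interval: tuple      # (start, end) of the window
    resource_type: str
    borrower: str        # agent that appears to have a shortfall
    lenders: list        # agents that appear to have a surplus

def lending_possibilities(global_map):
    """Derive lending possibilities from exchanged resource abstractions.

    global_map[(interval, resource_type)][agent] is a triple
    (in_use, requested, available), mirroring the bottleneck-status-list.
    Assumption: an agent has a shortfall when requested > available and
    is a possible lender when available > requested.
    """
    found = []
    for (interval, rtype), by_agent in global_map.items():
        lenders = [a for a, (_, req, avail) in by_agent.items() if avail > req]
        for agent, (_, req, avail) in by_agent.items():
            if req > avail:
                found.append(LendingPossibility(
                    interval, rtype, agent,
                    [l for l in lenders if l != agent]))
    return found

global_map = {
    ((600, 660), "fuel-truck"): {"A": (1, 3, 2), "B": (0, 1, 3), "C": (2, 2, 2)},
}
possibilities = lending_possibilities(global_map)
print(possibilities)
```

In this example agent A appears short of fuel trucks in the window and agent B is the only apparent lender; agent C is neither.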

3.1  The  Distributed  Airport  Resource  Management  System 

The Distributed Airport Resource Management System (DiS-ARM) testbed was constructed using
DiS-DSS to study the roles of coordination and negotiation in a distributed problem
solver. DiS-ARM solves distributed AGSS problems where the function of each scheduling
agent is to ensure that each flight for which it is responsible receives the ground
servicing (gate assignment, baggage handling, catering, fuel, cleaning, etc.) that it
requires in time to meet its arrival and departure deadlines. The supplying of a resource is
usually  a  multi-step  task  consisting  of  setup,  travel,  and  servicing  actions.  Each  resource 
task  is  a  subtask  of  the  airplane  servicing  supertask.  There  is  considerable  parallelism 
in  the  task  structure:  many  tasks  can  be  done  simultaneously.  However,  the  choice 
of  certain  resource  assignments  can  often  constrain  the  start  and  end  times  of  other 
tasks.  For  example,  selection  of  a  specific  arrival  gate  for  a  plane  may  limit  the  choice 
of  servicing  vehicles  due  to  transit  time  from  their  previous  servicing  locations  and  may 
limit  refueling  options  due  to  the  presence  or  lack  of  underground  fuel  tanks  at  that  gate. 
For this reason, resources of a specific type cannot all be considered interchangeable in
the AGSS domain. Only the agent that owns the resource can identify all the current
constraints on that resource and decide whether or not it can be allocated to meet a
specific demand.

4  Exploiting  Meta-level  Information  in  DiS-DSS 

In this section, we examine three areas in which meta-level abstractions of global resource
requirements are exploited in DiS-DSS. We show how the goal rating scheme of an agent’s
blackboard-based scheduler is modified to satisfy the twin aims of scheduling based on
global constraints and of planning activities in order to reduce uncertainty about agent
interactions. We describe how communication of resource abstractions is based on
models of agents’ interests, and the manner in which agents choose between local and
remote methods of satisfying a request.

4.1 Scheduling using Texture Measures

Many  scheduling  systems  divide  processing  into  the  categories  of  variable  selection,  the 
choice  of  the  next  activity  to  schedule,  and  value  selection,  the  selection  of  a  resource  and 
time  slot  for  that  activity.  In  DiS-DSS,  variable  selection  corresponds  to  the  satisfaction 
of  a  particular  resource  request.  Value  selection  is  handled  in  DSS  by  a  collection  of 
opportunistic  scheduling  heuristics.  We  focus  here  on  the  problem  of  coordinating 
resource  requests  so  that  local  variable-selection  heuristics  possess  sufficient  information 
to  make  informed  decisions. 

In  many  knowledge-based  scheduling  systems,  the  object  of  control  is  to  arrange 
scheduling  activities  so  that  the  most  tightly  constrained  activities  are  scheduled  first  in 
order  to  reduce  the  need  for  backtracking.  In  a  distributed  system,  we  have  an  additional 
criterion:  to  schedule  problem-solving  activities  in  such  a  way  that  global  uncertainty 
about  certain  tasks  is  reduced  before  decisions  regarding  those  tasks  are  made.  A  scheduler 
may  be  uncertain  of  whether  other  agents  will  request  a  resource  in  a  tightly  constrained 
time  period  and  whether  other  agents  will  be  able  to  supply  a  needed  resource.  While 
the  resource  abstractions  may  indicate  a  loan  request  is  likely,  the  duration  of  the  loan 
and  details  of  the  resource’s  destination  can  only  be  determined  once  the  request  has 
been  received.  Likewise,  details  of  the  precise  timing  and  duration  of  a  loan  can  only  be 
determined  upon  receipt  of  a  remote  reservation.  We  have  added  coordination  heuristics 
to  the  agenda  scheduler  of  DiS-DSS  whose  purpose  is  to  promote  problematic  activities 
in  each  agent’s  scheduling  queue  so  that  their  early  execution  will  reduce  uncertainty 
about  global  system  requirements. 

In the DiS-DSS blackboard-based architecture, tasks that require resources generate
service-goals. Requests received from remote agents generate remote-service-goals. Each
goal stimulates knowledge sources that act to secure an appropriate resource. The order
of execution of knowledge sources depends on the rating of the stimulating goals. Goals
are rated using a basic most-tightly-constrained-first opportunistic heuristic. The goals
are then stratified according to the following scheme, with the uppermost levels receiving
the highest priority and contention within each level being resolved according to the
basic rating heuristic.

1. Tightly constrained goals that may not be satisfiable locally or that can only be
satisfied by a borrowing event, and remote requests that do not overlap any local
request.

2.  Tightly  constrained  goals  that  can  only  be  satisfied  locally. 

3.  Goals  representing  requests  from  remote  agents  that  overlap  local  goals. 

4.  Unconstrained  or  loosely  constrained  tasks. 

5.  Goals  that  potentially  overlap  with  tasks  of  remote  agents. 

A goal g is considered to be tightly constrained in interval $(t_j, t_k)$ if there exists a
time within that interval such that for each resource type r that can satisfy g, the number
of unreserved resources is less than the sum of the average demand for all outstanding
goals.

$$\forall r \ \text{s.t.}\ \mathit{Sat}(g, r),\ \exists t \in (t_j, t_k) \ \text{s.t.}\ \mathit{available}(r, t) < \sum_{g'} \mathit{avg\_demand}(\mathit{task}(g'), r, t_j, t_k)$$

where $g'$ ranges over the outstanding goals.

A goal potentially overlaps with a task of a remote agent if there exists a lending-possibility
data structure for that remote agent describing a potential shortfall within the
time interval spanned by that goal for some resource type that could satisfy the goal.
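
The tight-constraint test can be sketched as a small predicate. The sketch follows the quantifier order of the formula (for each candidate resource type, some time in the window at which supply falls below the summed average demand) and assumes integer time steps; the function name, its parameters, and the sample data are hypothetical.

```python
def tightly_constrained(candidate_types, window, available, outstanding_demand):
    """Check the tight-constraint test for one goal.

    candidate_types: resource types that can satisfy the goal.
    window: (t_j, t_k) as integer time steps.
    available(r, t): number of unreserved resources of type r at time t.
    outstanding_demand[r]: average demand on r over the window, summed
        over all outstanding goals.
    Returns True when every candidate type has some time in the window
    at which supply falls below that summed demand.
    """
    t_j, t_k = window
    return all(
        any(available(r, t) < outstanding_demand[r] for t in range(t_j, t_k))
        for r in candidate_types
    )

# Hypothetical data: one truck type whose supply dips to 1 at t = 2.
supply = {"fuel-truck": [3, 3, 1, 3]}

def available(r, t):
    return supply[r][t]

print(tightly_constrained(["fuel-truck"], (0, 4), available, {"fuel-truck": 1.5}))
print(tightly_constrained(["fuel-truck"], (0, 4), available, {"fuel-truck": 0.5}))
```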

The rationale for this goal ordering is as follows. Goals that cannot be satisfied locally
must be transmitted to remote agents. The transmission of a goal conveys considerably
more information than is available in the resource texture profiles. The potential lending
agent will therefore have more accurate information regarding the interval for which the
resource is desired and the shift preference for the reservation in that interval
(early or late). Once it has received the goal, it will be able to make more informed
decisions about the tightness of constraints for both the local and remote goals. If the
agent is able to satisfy the remote goal, it will be able to update its resource demand
curve and transmit it to other agents who may also have been potential lenders of that
resource. For all these reasons, early transmittal and satisfaction of remote service goals
is desirable.

Tightly  constrained  goals  that  potentially  overlap  remote  requests  are  deferred  until 
some  overlapping  goal  arrives,  or  until  a  resource  update  arrives  indicating  that  the  remote 
agent  no  longer  requires  that  resource,  or  until  no  other  work  is  available  for  the  agent 
to  perform.  By  deferring  goals  until  more  information  about  interactions  is  available, 
the  system  can  avoid  making  premature  decisions  while  at  the  same  time  working  on 
unrelated  or  less  constrained  tasks.  Once  a  request  arrives,  conflicts  for  resources  can  be 
arbitrated  according  to  which  goal  is  most  pressing  and  least  conducive  to  backtracking 
and/or  constraint  relaxation. 

There  are  a  number  of  competing  requirements  for  the  rating  and  processing  of 
remote  service  goals.  One  would  like  to  process  a  remote  service  goal  as  soon  as  possible 
in  order  to  return  information  to  the  requesting  agent.  At  the  same  time,  both  local  and 
remote  service  goals  requesting  the  same  type  of  resource  should  be  rated  according  to 
the  same  constraint  tightness  heuristics.  The  goal  rating  function  in  DiS-DSS  attempts 
to  satisfy  these  requirements  by  prioritizing  those  remote  service  goals  that  do  not 
overlap  any  local  service  goals  and  by  mapping  overlapping  remote  service  goals  onto 
the same priority level as those local goals that they overlap. Note that the “overlapping”
relationship is transitive: if the priority of a goal is reduced while waiting for a remote
request, any lower rated goal that overlaps that goal’s time interval must also wait even
though it may not directly overlap the interval of the potential remote request.
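
The five-level stratification of goals can be sketched as a sort key. This is a simplified reading: the field names are hypothetical, the precedence among overlapping conditions is an assumption, and the transitive deferral of overlapping goals is not modeled.

```python
from dataclasses import dataclass

@dataclass
class Goal:
    name: str
    rating: float                      # basic most-tightly-constrained-first rating
    remote: bool = False               # request received from another agent
    tight: bool = False
    needs_borrowing: bool = False      # cannot be satisfied with local resources
    overlaps_local: bool = False       # remote request overlapping a local goal
    may_overlap_remote: bool = False   # covered by a lending possibility

def level(goal):
    """Map a goal to one of the five strata (1 is served first)."""
    if goal.remote:
        return 3 if goal.overlaps_local else 1
    if goal.may_overlap_remote:
        return 5  # deferred until more is known about the interaction
    if goal.tight:
        return 1 if goal.needs_borrowing else 2
    return 4

def agenda_order(goals):
    # Lower level first; within a level, higher basic rating first.
    return sorted(goals, key=lambda g: (level(g), -g.rating))

goals = [
    Goal("clean-F3", rating=0.2),
    Goal("fuel-F1", rating=0.9, tight=True),
    Goal("gate-F2", rating=0.7, tight=True, may_overlap_remote=True),
    Goal("truck-for-B", rating=0.5, remote=True),
    Goal("bags-F4", rating=0.8, tight=True, needs_borrowing=True),
]
print([g.name for g in agenda_order(goals)])
```

Note that the highly rated goal gate-F2 ends up last because it may overlap a remote request, which mirrors the deferral described above.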

4.2  Guiding  Communication  using  Texture  Measures 

Reducing  communication  costs  is  an  important  issue  in  distributed  systems.  For  this 
reason,  DiS-DSS  agents  use  the  lending  possibility  models  of  agent  interactions  to  guide 
communication  activities.  When  its  resource  requirements  change,  an  agent  transmits 
the  information  about  the  resource  type  only  to  those  agents  who,  based  on  its  local 
information,  would  be  interested  in  receiving  updates  concerning  that  resource  type.  An 
agent with no surplus resources of a given type may not be interested if the local agent
increases its need for a particular resource; likewise, an agent with a surplus of a particular
resource may not need to be notified if an agent reduces its demand for that resource
type. However, agents who possess shortfalls in a time interval for a particular type of
resource  will  receive  updates  during  processing  whenever  an  agent  increases  the  precision 
of  its  resource  abstractions  by  securing  or  releasing  a  resource. 
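
One possible reading of this communication policy is sketched below. The function name, the "balance" encoding, and the exact rule for each case are assumptions made for illustration; the text does not specify the filter at this level of detail.

```python
def update_recipients(change, peers):
    """Pick the peers likely to care about a change in local demand.

    change: "increase" or "decrease" in the local agent's demand for one
        resource type in one interval.
    peers: dict mapping agent name to its believed balance for that type
        and interval (available minus requested); negative is a shortfall.
    Assumed policy: agents with a shortfall always get the update,
    surplus holders only hear about increases, and exactly balanced
    agents only hear about decreases.
    """
    recipients = []
    for agent, balance in peers.items():
        if balance < 0:
            recipients.append(agent)
        elif balance > 0 and change == "increase":
            recipients.append(agent)
        elif balance == 0 and change == "decrease":
            recipients.append(agent)
    return recipients

peers = {"A": -1, "B": 2, "C": 0}
print(update_recipients("increase", peers))
print(update_recipients("decrease", peers))
```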

The use of local knowledge to guide communication episodes may lead to agents’
knowledge of the global state of the system becoming increasingly out of date. The
degree to which this should be allowed to happen is dependent upon the acceptable level
of uncertainty in the system and the accuracy with which resource abstractions can be
made.


4.3  Ordering  Methods  for  Achieving  Resource  Assignments 

In  DSS,  the  process  of  securing  a  resource  is  achieved  through  a  series  of  increasingly  costly 
methods:  assignment,  preemption,  and  right  shifting.  These  correspond  roughly  to 
request  satisfaction,  backtracking,  and  constraint  relaxation.  Preemption  is  a  conservative 
form  of  backtracking  in  which  existing  reservations  are  preempted  in  favor  of  a  more 
constrained  task.  Right  shifting  satisfies  otherwise  intractable  requests  by  shifting  the  time 
interval  of  the  reservation  downstream  (later)  until  a  suitable  resource  becomes  available. 
Because  this  method  relaxes  the  latest  finish  time  constraint,  it  has  the  potential  to 
seriously  decrease  the  quality  of  a  solution.  In  the  AGSS  domain,  for  example,  right 
shifting  a  reservation  may  result  in  late  departures. 

In  DSS,  methods  are  ordered  according  to  increasing  cost.  In  the  distributed  version 
of  the  system,  the  choice  and  ordering  of  methods  is  more  complex.  When  an  agent 
cannot  immediately  acquire  a  resource  locally,  it  faces  a  decision:  should  it  perform 
backtracking  or  constraint  relaxation  locally,  communicating  only  when  it  has  exhausted 
all  local  alternatives,  or  should  it  immediately  attempt  to  borrow  the  resource  from 
another  agent?  The  decision-making  process  becomes  even  more  difficult  if  we  allow 
requests  from  remote  agents  to  take  precedence  over  local  requirements  such  that  agents 
may  have  to  perform  backtracking  or  constraint  relaxation  in  order  to  satisfy  a  remote 
request. We consider this last decision process a form of negotiation because it involves
determining  which  of  two  agents  should  bear  the  cost  of  reduced  solution  quality  and/or 
increased  problem-solving  effort. 

In  DiS-DSS,  we  use  the  lending  possibility  data  structures  to  dynamically  generate 
plans  for  achieving  each  resource  assignment.  When  it  appears  that  a  remote  agent  will 
have  surplus  resources  at  the  necessary  time,  then  the  agent  will  generate  a  request  as  soon 
as it becomes clear that the resource cannot be secured locally. If, however, it appears that
the  resource  is  tightly  constrained  globally,  the  agent  will  choose  to  perform  backtracking 
and/or  constraint  relaxation  operations  locally  rather  than  engage  in  communication 
episodes  that  will  probably  prove  futile. 
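One way to picture this dynamic plan generation is a small function that reorders the acquisition methods according to an estimate of remote surplus. This is only a sketch under stated assumptions: the surplus estimate, the `surplus_cutoff` value, and the method names are hypothetical, and keeping the remote request as a last resort in the tightly constrained case is our choice for the example rather than something the text specifies.

```python
from typing import List


def plan_methods(estimated_remote_surplus: float, surplus_cutoff: float = 0.5) -> List[str]:
    """Order resource-securing methods using an abstraction of remote availability.

    estimated_remote_surplus: hypothetical score in [0, 1] derived from the
    lending possibility data; higher means remote agents probably have spare
    resources at the required time.
    """
    if estimated_remote_surplus >= surplus_cutoff:
        # A remote agent probably has surplus: ask as soon as local assignment fails.
        return ["assignment", "remote_request", "preemption", "right_shift"]
    # The resource looks tightly constrained globally: exhaust local methods first.
    return ["assignment", "preemption", "right_shift", "remote_request"]


loose_plan = plan_methods(0.8)
tight_plan = plan_methods(0.1)
```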

One  use  of  meta-information  occurs  during  the  planning  for  constraint  relaxation. 
The  scheduling  agent  attempts  to  minimize  the  magnitude  of  the  right  shift  in  order  to 
reduce  the  effect  of  the  constraint  relaxation  on  the  quality  of  the  solution.  To  do  this, 
the  agent  must  determine  whether  the  minimum  right  shift  can  be  achieved  locally  or 
remotely.  However,  requiring  agents  to  submit  bids  detailing  their  earliest  reservations 
for  a  given  resource  would  be  a  costly  process.  Instead,  the  agent  uses  the  abstractions  of 
remote  resource  availability  to  generate  a  threshold  value  for  the  right  shift  delay.  If  this 
value  is  less  than  the  delay  achieved  through  right  shifting  locally,  the  agent  sequentially 
transmits  the  resource  request  to  the  appropriate  remote  agents.  If  a  remote  agent 
can provide a reservation with a delay of less than or equal to the threshold value, it
immediately secures the resource. Otherwise, it returns the delay of the earliest possible
reservation.  If  no  reservation  is  found,  the  local  agent  sets  the  threshold  to  the  earliest 
possible  value  returned  by  some  remote  agent.  This  new  threshold  is  then  compared  to 
the  current  best  local  delay  (which  might  have  changed  due  to  local  scheduling  while  the 
remote  requests  were  being  processed).  This  process  continues  until  a  reservation  is  made 
or  until  the  threshold  becomes  greater  than  the  delay  achievable  by  right  shifting  locally. 
Obviously, the better the initial estimate for the delay threshold, the fewer communication
activities will be required.
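The threshold procedure described above can be sketched as a short loop. The names `RemoteAgent` and `negotiate_right_shift` are hypothetical, remote agents are assumed to have a fixed earliest delay during the exchange, and the local delay is passed as a callable so that it can be re-read each round, since it may change while remote requests are pending.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RemoteAgent:
    name: str
    earliest_delay: int  # delay of this agent's earliest possible reservation

    def request(self, threshold: int) -> Tuple[bool, int]:
        """Secure the resource if the delay is within the threshold;
        otherwise report the delay of the earliest possible reservation."""
        return self.earliest_delay <= threshold, self.earliest_delay


def negotiate_right_shift(threshold: int,
                          local_delay: Callable[[], int],
                          agents: List[RemoteAgent]) -> Tuple[str, int, int]:
    """Return (provider, delay, number of requests sent)."""
    messages = 0
    # Ask remotely only while the threshold promises less delay than shifting locally.
    while agents and threshold < local_delay():
        reported = []
        for agent in agents:  # requests are transmitted sequentially
            messages += 1
            secured, delay = agent.request(threshold)
            if secured:
                return agent.name, delay, messages
            reported.append(delay)
        # Nobody could meet the threshold: raise it to the earliest delay reported.
        threshold = min(reported)
    return "local", local_delay(), messages


agents = [RemoteAgent("A", 6), RemoteAgent("B", 4)]
poor_estimate = negotiate_right_shift(2, lambda: 5, agents)
good_estimate = negotiate_right_shift(4, lambda: 5, agents)
stay_local = negotiate_right_shift(2, lambda: 3, agents)
```

With an initial threshold of 2 the agent needs four requests before agent B secures the resource, while an initial threshold of 4 needs only two, which illustrates why a better initial estimate reduces communication.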

The  meta-information  is  also  used  to  determine  the  order  in  which  agents  should 
be  asked  for  resources,  beginning  with  the  agent(s)  with  the  least  tightly  constrained 
resources. 


5  Experimental  Results 

The performance of the mechanisms that we have developed for DiS-DSS was tested in
a series of experiments using a single agent system as a basis for comparison. We used six
scenarios  designed  to  test  the  performance  of  the  system  in  tightly  constrained  situations. 
The  number  of  orders  in  each  scenario  ranged  from  10  to  60  and  a  minimal  set  of 
resources  was  defined  for  each  scenario.  Each  scenario  was  distributed  for  a  three  agent 
case.  Orders  were  assigned  to  each  agent  on  a  round-robin  basis  such  that  each  agent 
would  perform  approximately  the  same  amount  of  work.  Resources  were  distributed 
randomly  so  that  in  some  cases  each  agent  would  possess  all  necessary  resources  while  in 
other  cases,  borrowing  from  remote  agents  would  be  necessary. 

We  ran  DiS-ARM  on  each  scheduling  scenario  using  the  following  configurations  of 
the  scheduler: 

•  The  baseline  case  with  a  single  agent. 

•  The  3  agent  case  with  no  use  of  meta-level  information,  and  an  opportunistic 
(most-tightly-constrained-variable-first)  goal  rating  scheme 

•  The  3  agent  case  using  the  heuristic  goal  rating  scheme  incorporating  meta-level 
information  but  requesting  resources  from  remote  agents  only  when  all  local 
methods  have  failed. 

•  The 3 agent case using heuristic goal rating, meta-level information, and dynamic
reordering of resource acquisition methods to account for the probability
of securing a goal either locally or remotely.

For  each  run,  we  recorded  the  average  tardiness  of  the  schedule,  the  number  of  failed 
goals  (if  any),  the  number  of  resource-securing  methods  tried,  the  number  of  requests,  the 
number of satisfied remote service goals, and the number of communication episodes that
occurred  during  problem  solving.  In  each  case,  we  assumed  that  communication  costs 
were  negligible  in  relation  to  problem-solving  and  that  requests  and  resource  constraint 
updates  would  be  received  on  the  simulation  cycle  immediately  succeeding  the  one  in 
which  they  were  sent. 
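For concreteness, two of the recorded measures can be computed as follows. The text does not define tardiness formally, so this sketch assumes the usual scheduling definition (the amount by which a task finishes after its due time, never negative); the function names are ours.

```python
from typing import Iterable, Tuple


def tardiness(finish: float, due: float) -> float:
    """Amount by which a task finishes after its due time (zero if on time)."""
    return max(0.0, finish - due)


def schedule_metrics(tasks: Iterable[Tuple[float, float]]) -> Tuple[float, int]:
    """Return (average tardiness, number of tardy tasks) for (finish, due) pairs."""
    values = [tardiness(finish, due) for finish, due in tasks]
    if not values:
        return 0.0, 0
    return sum(values) / len(values), sum(1 for v in values if v > 0)


average, tardy_count = schedule_metrics([(10, 12), (15, 12), (20, 14)])
```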

Because  of  the  small  number  of  test  cases  we  have  examined  in  our  preliminary 
experiments,  we  present  our  results  anecdotally.  As  expected,  the  distributed  version  of 
the  scheduler  always  produces  a  schedule  of  somewhat  lower  quality  than  the  centralized 
one.  When  the  opportunistic  scheduler  of  the  centralized  version  is  used  for  scheduling 
in  a  distributed  environment,  its  lack  of  information  about  global  constraints  causes  it 
to produce somewhat inferior results. The heuristic incorporating meta-level information
consistently outperforms the opportunistic scheduler in terms of the number of
tardy tasks. The opportunistic scheduler will occasionally produce a schedule with less
total tardiness than the distributed algorithm. We interpret this as a trade-off between
satisfying global requirements (by delaying certain goal satisfactions until remote information
becomes available) and satisfying local requirements by producing needed results
promptly. This is an interesting trade-off that we intend to study in depth. Attempting
to  always  solve  problems  locally  using  preemption  and  constraint  relaxation  produced 
schedules  with  much  greater  delays  than  when  agents  dynamically  determined  when  to 
request  resources  remotely  based  on  the  meta-level  resource  abstractions. 

6  Conclusions  and  Future  Work 

The  work  we  have  performed  with  DiS-DSS  is  preliminary,  but  promising.  Our  results 
indicate  that  the  idea  of  using  meta-level  information  to  schedule  activities  in  order  to 
reduce  local  uncertainty  about  global  constraints  results  in  better  coordination  between 
agents  with  a  subsequent  increase  in  goal  satisfaction.  We  have  also  demonstrated  that 
meta-level  information  can  be  successfully  used  to  guide  the  choice  between  satisfying 
goals  locally  and  remotely,  and  in  optimizing  the  choice  of  agents  from  which  to  request 
resources. 

Our experiments were performed with each agent’s orders being defined statically
before  scheduling.  This  allowed  the  agents  to  develop  a  model  of  their  predicted  resource 
requirements  before  scheduling  began.  If  we  were  to  model  a  system  in  which  orders 
changed  dynamically,  either  due  to  equipment  failures  or  timetable  changes,  we  would 
expect  the  model  of  global  resource  requirements  to  become  increasingly  inaccurate.  We 
would  like  to  understand  the  implications  of  allowing  jobs  to  arrive  dynamically  on  the 
performance  of  a  distributed  system  using  meta-level  information. 

As  well  as  continuing  to  explore  the  role  of  meta-level  resource  abstractions,  we 
plan  to  use  the  DiS-DSS  testbed  to  explore  a  number  of  important  issues  in  distributed 
scheduling. One of our primary goals is to expand the idea of negotiation between agents
that  we  have  touched  upon  in  this  paper.  Because  the  airport  ground  service  scheduling 
domain  represents  a  “real  world”  scenario,  we  are  able  to  create  a  meaningful  cost  model 
involving  not  only  the  delay  in  each  schedule,  but  the  probable  cost  of  that  delay  in  terms 
of  missed  connections.  By  allowing  agents  to  exchange  this  information  when  requesting 
resources,  they  will  be  able  to  more  meaningfully  weigh  the  importance  of  local  tasks 
against  the  quality  of  the  global  solution. 
Abstract

In this paper we discuss a number of previously unaddressed issues that arise in
automated negotiation among self-interested agents whose rationality is bounded by
computational complexity. These issues are presented in the context of iterative task
allocation negotiations. First, the reasons why such agents need to be able to choose
the stage and level of commitment dynamically are identified. A protocol that allows
such choices through conditional commitment breaking penalties is presented. Next, the
implications of bounded rationality are analyzed. Several tradeoffs between allocated
computation and negotiation benefits and risk are enumerated, and the necessity of
explicit local deliberation control is substantiated. Techniques for linking negotiation
items and multiagent contracts are presented as methods for escaping local optima in
the task allocation process. Implementing both methods among self-interested bounded
rational agents is discussed. Finally, the problem of message congestion among
self-interested agents is described, and alternative remedies are presented.

1  Introduction 

The importance of automated negotiation systems is likely to increase [Office of
Technology Assessment (OTA), 1994]. One reason is the growth of a fast and inexpensive
standardized communication infrastructure (EDI, NII, KQML [Finin et al., 1992],
Telescript [General Magic, Inc., 1994] etc.), over which separately designed agents
belonging to different organizations can interact in an open environment in real-time,
and safely carry out transactions [Kristol et al., 1994; Sandholm and Lesser, 1995d].
Secondly, there is an industrial trend towards agile enterprises: small, organizational
overhead avoiding enterprises that form short term alliances to be able

*This research was supported by ARPA contract N00014-92-J-1698. The content does not
necessarily reflect the position or the policy of the Government and no official
endorsement should be inferred. T. Sandholm was also funded by a University of
Massachusetts Graduate School Fellowship, Leo and Regina Wainstein Foundation, Heikki
and Hilma Honkanen Foundation, and Ella and George Ehrnrooth Foundation.


to respond to larger and more diverse orders than they individually could. Such
ventures can take advantage of economies of scale when they are available, but do not
suffer from diseconomies of scale. This concept paper explores the implications of
performing such negotiations where agents are self-interested (SI)^1 and must make
negotiation decisions in real-time with bounded or costly computation resources.

We cast such negotiations in the following domain independent framework. Each agent
has a (possibly empty) set of tasks and a (possibly empty) set of resources it can use
to handle tasks. These sets change due to domain events, e.g. new tasks arriving or
resources breaking down. The agents can subcontract tasks to other agents by paying a
compensation. This subcontracting process can involve breaking a task into a number of
subtasks handled by different agents, or clustering a number of tasks into a supertask.
A task transfer is profitable from the global perspective if the contractee can handle
the task less expensively than the contractor, or if the contractor cannot handle it at
all, but the contractee can. So, the problem has two levels: a global task allocation
problem, and each agent’s local combinatorial optimization problem defined by the
agent’s current tasks and resources. The goal of each agent is to maximize its payoff,
which is defined as its income minus its costs. Income is received for handling tasks,
and costs are incurred by using resources to handle the tasks. We restrict ourselves to
domains where the feasibility and cost of handling a task do not depend on what other
agents do with their resources or how they divide tasks among themselves, but do depend
on the other tasks that the agent has^2. The global solution can be evaluated from a
social welfare viewpoint according to the sum of the agents’ payoffs.
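The definitions in this framework can be written down directly. The following is a minimal sketch with hypothetical function names; it represents "cannot handle the task at all" as a cost of `None` and uses a single fixed payment as the compensation, both of which are simplifying assumptions for the example.

```python
from typing import Iterable, Optional, Tuple


def payoff(income: float, costs: float) -> float:
    """An agent's payoff is its income minus its costs."""
    return income - costs


def social_welfare(payoffs: Iterable[float]) -> float:
    """The global solution is evaluated by the sum of the agents' payoffs."""
    return sum(payoffs)


def transfer_is_profitable(contractor_cost: Optional[float],
                           contractee_cost: Optional[float]) -> bool:
    """Globally profitable if the contractee is cheaper, or if only the contractee
    can handle the task. A cost of None means the agent cannot handle the task."""
    if contractee_cost is None:
        return False
    if contractor_cost is None:
        return True
    return contractee_cost < contractor_cost


def gains_from_transfer(contractor_cost: float, contractee_cost: float,
                        payment: float) -> Tuple[float, float]:
    """Change in payoff for (contractor, contractee) when the task moves for a payment."""
    return contractor_cost - payment, payment - contractee_cost


contractor_gain, contractee_gain = gains_from_transfer(10.0, 6.0, 8.0)
welfare_gain = social_welfare([contractor_gain, contractee_gain])
```

Any payment strictly between the two costs leaves both agents better off, and the total gain equals the cost difference regardless of the payment chosen.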

Reaching good solutions for the global task allocation problem is difficult with SI
agents, e.g. because they may not truthfully share all information. The problem is
further complicated by the agents’ bounded rationality: local decisions are suboptimal
due to the inability

^1 In domains where agents represent different real world organizations, each agent
designer will want its agent to do as well as it can without concern for other agents.
Conversely, some domains are inherently composed of benevolent agents. For example, in
a single factory scheduling problem, each work cell can be represented by an agent. If
the cells do not have private goals, the agents should act benevolently.

^2 Such domains are a superset of what [Rosenschein and Zlotkin, 1994] call Task
Oriented Domains, and intersect their State Oriented and Worth Oriented Domains.


to precisely compute the value associated with accepting a task. This computation is
especially hard if the feasibility and cost of handling a task depend on what other
tasks an agent has. These problems are exacerbated by the uncertainty of an open
environment in which new agents and new tasks arrive, so that previous decisions may be
suboptimal in light of new information.

The original contract net protocol (CNP) [Smith, 1980] did not explicitly deal with
these issues, which we think must be taken into account if agents are to operate
effectively in a wide range of automated negotiation domains. A first step towards
extending the CNP to deal with these issues was the work on TRACONET [Sandholm, 1993].
It provided a formal model for bounded rational (BR) self-interested agents to make
announcing, bidding and awarding decisions. It used a simple static approximation
scheme for marginal cost^3 calculation to make these decisions. The choice of a
contractee is based solely on these marginal cost estimates. The monetary payment
mechanism allows quantitative tradeoffs between alternatives in an agent’s negotiation
strategy. Within DAI, bounded rationality (approximate processing) has been studied
with cooperative agents, but among SI agents, perfect rationality has been widely
assumed, e.g. [Rosenschein and Zlotkin, 1994; Ephrati and Rosenschein, 1991; Kraus et
al., 1992]. We argue that in most real multiagent applications, resource-bounded
computation will be an issue, and that bounded rationality has profound implications
on both negotiation protocols and strategies.

Although the work on TRACONET was a first step towards this end, it is necessary, as
discussed in the body of this paper, to extend the CNP in significant ways in order for
bounded rational self-interested (BRSI) agents to deal intelligently with the
uncertainty present in the negotiation process. This new protocol represents a family
of different protocols in which agents can choose different options depending on both
the static and dynamic context of the negotiation. The first option we will discuss
regards commitment. We present ways of varying the stage of commitment, and more
importantly, how to implement varying levels of commitment that allow more flexible
local deliberation and a wider variety of negotiation risk management techniques by
allowing agents to back out of contracts. The second option concerns local
deliberation. Tradeoffs are presented between negotiation risks and computation costs,
and an approximation scheme for marginal cost calculation is suggested that dynamically
adapts to an agent’s negotiation state. The third set of options has to do with
avoiding local optima in the task allocation space by linking negotiation items and by
contracts involving multiple agents. The fourth set of options concerns message
congestion management. We present these choices in terms of a new protocol for
negotiation among BRSI agents that, to our knowledge, subsumes the CNP and most (if not
all) of its extensions.


^3 The marginal cost of adding a set of tasks to an agent’s solution is the cost of
the agent’s solution with the new task set minus the cost of the agent’s solution
without it.
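The marginal cost definition can be made concrete with a toy local cost function. The cost model below (a fixed charge per distinct site plus the work of each task) is purely hypothetical; it is chosen only to show that the marginal cost of the same task depends on which other tasks the agent already has. In a real agent, computing the solution cost would itself be a hard combinatorial optimization problem.

```python
from typing import Callable, FrozenSet, Tuple

Task = Tuple[str, int]  # (site, amount of work): a hypothetical task description


def solution_cost(tasks: FrozenSet[Task]) -> int:
    """Toy cost model: 10 per distinct site visited, plus the work of each task."""
    sites = {site for site, _ in tasks}
    return 10 * len(sites) + sum(work for _, work in tasks)


def marginal_cost(current: FrozenSet[Task], new_tasks: FrozenSet[Task],
                  cost_fn: Callable[[FrozenSet[Task]], int]) -> int:
    """Cost of the solution with the new task set minus the cost without it."""
    return cost_fn(current | new_tasks) - cost_fn(current)


new_task = frozenset({("A", 2)})
cost_when_already_at_site = marginal_cost(frozenset({("A", 3)}), new_task, solution_cost)
cost_when_idle = marginal_cost(frozenset(), new_task, solution_cost)
```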


2  Commitment  in  negotiation  protocols 

2.1  Alternative  commitment  stages 

In mutual negotiations, commitment means that one agent binds itself to a potential
contract while waiting for the other agent to either accept or reject its offer. If the
other party accepts, both parties are bound to the contract. When accepting, the second
party is sure that the contract will be made, but the first party has to commit before
it is sure. Commitment has to take place at some stage for contracts to take place, but
the choice of this stage can be varied. TRACONET was designed so that commitment took
place in the bidding phase as is usual in the real world: if a task is awarded to a
bidder, the bidder has to take care of it at the price mentioned in the bid. Shorter
protocols (commitment at the announcement phase^4) can be constructed as well as
arbitrarily long ones (commitment at the awarding phase or some later stage).

The choice of commitment stage can be a static protocol design decision or the agents
can decide on it dynamically. For example, the focused addressing scheme of the CNP was
implemented so that in low utilization situations, contractors announced tasks, but in
high utilization mode, potential contractees signaled availability, i.e. bid without
receiving announcements first [Smith, 1980; Van Dyke Parunak, 1987]. So, the choice of
a protocol was based on characteristics of the environment. Alternatively, the choice
can be made for each negotiation separately before that negotiation begins. We advocate
a more refined alternative, where agents dynamically choose the stage of commitment of
a certain negotiation during that negotiation. This allows any of the above
alternatives, but makes the stage of commitment a negotiation strategy decision, not a
protocol design decision. The offered commitments are specified in contractor messages
and contractee messages, Fig. 1.

2.2  Levels  of  commitment 

In traditional multiagent negotiation protocols among SI agents, once a contract is
made, it is binding, i.e. neither party can back out. In cooperative distributed
problem solving (CDPS), commitments are often allowed to be broken unilaterally based
on some local reasoning that attempts to incorporate the perspective of common good
[Decker and Lesser, 1995]. A more general alternative is to use protocols with
continuous levels of commitment based on a monetary penalty method, where commitments
vary from unbreakable to breakable as a continuum by assigning a commitment breaking
cost to each commitment separately. This cost can also increase with time, decrease as
a function of acceptance time of the offer, or be conditioned on events in other
negotiations or the environment. Using the suggested message types, the level of
commitment can also be dynamically negotiated over on a per contract or per task set
basis.
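A commitment breaking cost that varies with time can be represented as a function, and the decision to decommit then reduces to a comparison. This is a minimal sketch under our own assumptions: the linear penalty shape, the names `linear_penalty` and `should_decommit`, and the simple "gain exceeds penalty" rule are illustrative and are not taken from the protocol specification.

```python
import math
from typing import Callable

PenaltyFn = Callable[[float], float]  # maps time to the cost of breaking the commitment


def linear_penalty(base: float, rate: float) -> PenaltyFn:
    """A commitment breaking cost that increases linearly with time."""
    return lambda t: base + rate * t


def full_commitment() -> PenaltyFn:
    """An unbreakable commitment, modeled as an infinite penalty at every time."""
    return lambda t: math.inf


def should_decommit(gain_from_alternative: float, penalty: PenaltyFn, now: float) -> bool:
    """Decommit only if the gain from the alternative outweighs the penalty due now."""
    return gain_from_alternative > penalty(now)


penalty = linear_penalty(base=2.0, rate=0.5)
early = should_decommit(5.0, penalty, now=4)    # penalty is 4.0 at t = 4
late = should_decommit(5.0, penalty, now=10)    # penalty is 7.0 at t = 10
never = should_decommit(1e9, full_commitment(), now=0)
```

Setting the penalty arbitrarily high reproduces a binding contract, so the two ends of the continuum are special cases of the same mechanism.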


^4 With announcement phase commitment, a task set can be announced to only one
potential bidder at a time, since the same task set cannot be exclusively awarded to
many agents.


Among other things, the use of multiple levels of commitment allows:

•  a low commitment search focus to be moved around in the global task allocation
space (because decommitting is not unreasonably expensive), so that more of that space
can be explored among SI agents which would otherwise avoid risky commitments^5,

•  flexibility to the agent’s local deliberation control, because marginal cost
calculation of a contract can go on even after that contract has already been agreed
upon,

•  an agent to make the same low-commitment offer (or offers that overlap in task
sets) to multiple agents. In case more than one accepts, the agent has to pay the
penalty to all but one of them, but the speedup of being able to address multiple
agents in committal mode may outweigh this risk,

•  the agents with a lesser risk aversion to carry a greater portion of the risk. The
more risk averse agent can trade off paying a higher price to its contractee (or get
paid a lower price as a contractee) for being allowed to have a lower decommitting
penalty, and

•  contingency contracts by conditioning the payments and commitment functions on
future negotiation events or domain events. These enlarge the set of mutually
beneficial contracts, when agents have different expectations of future events or
different risk attitudes [Raiffa, 1982].

The advantages of such a leveled commitment protocol are formally analyzed in
[Sandholm and Lesser, 1995a], and are now reviewed. Because the decommitment penalties
can be set arbitrarily high for both agents, the leveled commitment protocol can always
emulate the full commitment protocol. Furthermore, there are cases where there is no
full commitment contract between two agents that fulfills the participation constraints
(agent prefers to agree to the contract as opposed to passing) for both agents, but
where a leveled commitment contract does fulfill these constraints. This occurs even
among risk neutral agents, for example when uncertainty prevails regarding both agents’
future offers received, and both agents are assigned a (not too high or low, and not
necessarily identical) decommitment penalty in the contract. Among risk neutral agents,
this does not occur if only one of the agents is allowed the possibility to decommit
(other agent’s decommitment penalty is too high), or only one agent’s future is
uncertain. If the agents have biased information regarding the future, they may
perceive that such a contract with a one-sided decommitment possibility is viable
although a full commitment contract is not. In such cases, the agent whose information
is biased is likely to take the associated loss while the agent with unbiased
information is not.

Figure 1 describes the message formats of the new contracting protocol. A negotiation
can start with either a


CONTRACTOR MESSAGE:
0.  Negotiation identifier
1.  Message identifier
2.  In-response-to (message id)
3.  Sender
4.  Receiver
5.  Terminate negotiation
6.  Alternative 1
    6.1.  Time valid through
    6.2.  Bind after partner’s decommit
    6.3.  Offer submission fee
    6.4.  Required response submission fee
    6.5.  Task set 1
          (a)  (Minimum) specification of tasks
          (b)  Promised payment fn. to contractee
          (c)  Contractor’s promised commitment fn.
          (d)  Contractee’s required commitment fn.
    6.6.  Task set 2
    ...
    6.i.  Task set i-4
7.  Alternative 2
...
j.  Alternative j-5

CONTRACTEE MESSAGE:
0.  Negotiation identifier
1.  Message identifier
2.  In-response-to (message id)
3.  Sender
4.  Receiver
5.  Terminate negotiation
6.  Alternative 1
    6.1.  Time valid through
    6.2.  Bind after partner’s decommit
    6.3.  Offer submission fee
    6.4.  Required response submission fee
    6.5.  Task set 1
          (a)  (Maximum) specification of tasks
          (b)  Required payment fn. to contractee
          (c)  Contractor’s required commitment fn.
          (d)  Contractee’s promised commitment fn.
    6.6.  Task set 2
    ...
    6.m.  Task set m-4
7.  Alternative 2
...
n.  Alternative n-5

PAYMENT/DECOMMIT MESSAGE:
0.  Negotiation id
1.  Message id
2.  Accepted offer id
3.  Acceptance message id
4.  Sender
5.  Receiver
6.  Message type (payment/decommit)
7.  Money transfer

Figure 1: Contracting messages of a single negotiation.

contractor or a contractee message, Fig. 2. A contractor message specifies exclusive
alternative contracts that the contractor is willing to commit to. Within each
alternative, the tasks can be split into disjoint task sets by the sender of the
message in order for the fields (a)-(d) to be specific for each such task set, not
necessarily the whole set of tasks. Each alternative has the following semantics. If
the contractee agrees to handle all the task sets in a manner satisfying the minimum
required task descriptions (a) (which specify the tasks and constraints on them, e.g.
latest and earliest handling time or minimum handling quality), and the contractee
agrees to commit to each task set with the level specified in field (d), then the
contractor is automatically committed to paying^6 the amounts of fields (b), and can
cancel the deal on a task set only by paying the contractee a penalty (c)^7. Moreover,
the contractor is decommitted


^5 For example, an agent can accept a task set and later try to contract the tasks in
that set further separately. With full commitment, an agent needs to have standing
offers from the agents it will contract the tasks to, or it has to be able to handle
them itself. With the variable commitment protocol, the agent can accept the task set
even if it is not sure about its chances of getting it handled, because in the worst
case it can decommit.

^6 Secure money transfer can be implemented cryptographically e.g. by electronic
credit cards or electronic cash [Kristol et al., 1994].

^7 The “Bind after partner’s decommit” (6.2) flag describes whether an offer on an
alternative will stay valid according to its original deadline (field 6.1) even in the
case where the contract was agreed to, but the partner decommitted by paying the
decommitment penalty.


[Figure 2: diagram not recoverable from the extracted text. Legible transition labels:
contractor proposes; contractee proposes; contractor counterproposes; contractee
counterproposes; contractor accepts; contractee accepts; contractor or contractee
decommits (with the "Bind after partner's decommit" field (6.2) set, or not set, in the
partner's latest proposal); contractor or contractee terminates; contractor or
contractee misses the deadline of the last alternative in the partner's offer;
contractor makes partial payment by sending a payment message; contractee handles some
tasks of the contract; contract completed; real world law enforcement request; new
negotiation over same issues and between same agents still possible.]

Figure 2: State transition diagram of a single negotiation.


from all the other alternatives it suggested^8. If the contractee does not accept any
of the alternatives, the contractor is decommitted from all of them. Fields (b), (c)
and (d) can be functions of time, of negotiation events, or of domain events, and these
times/events have to be observable or verifiable by both the contractor and the
contractee. A contractee can accept one of the alternatives of a contractor message by
sending a contractee message that has task specifications that meet the minimal
requirements (a), and payment functions that meet the required payment functions (b),
and commitment functions (c) for the contractee that meet the required commitment
functions, and commitment functions (d) for the contractor that do not exceed the
contractor’s promised commitment. A contractor message can accept one of the
alternatives of a contractee message analogously. An agent can entirely terminate a
negotiation by sending a message with that negotiation’s identifier (field 0), and the
terminate-flag (field 5) set.
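The acceptance test described above can be sketched for a single task set. This is a simplification under stated assumptions: payment and commitment fields are treated as constants rather than functions of time or events, task specifications are plain sets of task identifiers, the class and function names are hypothetical, and the field lettering follows the listing in Figure 1.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ContractorTaskSet:
    min_tasks: FrozenSet[str]       # (a) minimum specification of tasks
    promised_payment: float         # (b) promised payment to contractee
    contractor_commitment: float    # (c) penalty the contractor promises if it decommits
    contractee_required: float      # (d) penalty required of the contractee


@dataclass(frozen=True)
class ContracteeTaskSet:
    max_tasks: FrozenSet[str]       # (a) maximum specification of tasks
    required_payment: float         # (b) payment the contractee requires
    contractor_required: float      # (c) penalty required of the contractor
    contractee_commitment: float    # (d) penalty the contractee promises if it decommits


def accepts(offer: ContractorTaskSet, reply: ContracteeTaskSet) -> bool:
    """True if the contractee's reply satisfies every requirement of the offer;
    otherwise the reply is a counterproposal."""
    return (offer.min_tasks <= reply.max_tasks
            and reply.required_payment <= offer.promised_payment
            and reply.contractor_required <= offer.contractor_commitment
            and reply.contractee_commitment >= offer.contractee_required)


offer = ContractorTaskSet(frozenset({"t1", "t2"}), 10.0, 3.0, 2.0)
acceptance = ContracteeTaskSet(frozenset({"t1", "t2", "t3"}), 9.0, 3.0, 2.5)
counterproposal = ContracteeTaskSet(frozenset({"t1", "t2"}), 12.0, 3.0, 2.5)
accepted = accepts(offer, acceptance)
countered = not accepts(offer, counterproposal)
```

A full message would carry several alternatives, each with several task sets, and acceptance of an alternative would require this check to hold for every task set in it.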

Alternatively, the contractee can send a contractee message that neither accepts the
contractor message (i.e. does not satisfy the requirements) nor terminates the
negotiation. Such a message is a counterproposal, to which the contractor can respond
by accepting, terminating the negotiation, or counterproposing further, and so on ad
infinitum^9. The CNP did not allow counterproposing: an agent could bid to an
announcement or decide not to bid. A contractor had the option to award or not to award
the tasks according to the bids. Counterproposing among cooperative agents was studied
in [Moehlman et al., 1992; Sen, 1993]. Our counterproposing mechanism is one way of
overcoming the problem of lacking truthful abstractions of the global search space
(defined by the task sets and resource sets of all the agents) in negotiation systems
consisting of SI agents.

^8 Another protocol would have offers stay valid according to their original
specification (deadline) no matter whether the partner accepts, rejects,
counterproposes, or does none of these. We do not use such protocols due to the
harmfully (Sec. 3) growing number of pending commitments.

^9 An agent that has just (counter)proposed can counterpropose again (dotted lines in
Fig. 2). This allows it to add new offers (that share the “In-response-to”-field with
the pending ones), but does not allow retraction of old offers. Retraction is
problematic in a distributed system, because the negotiation partner’s acceptance
message may be on the way while the agent sends the retraction.

There are no noncommittal messages such as announcements used to
declare tasks: all messages have some commitment specification for the
sender. In early messages in a negotiation, these commitment
specifications can be too low for the partner to accept, and
counterproposing occurs. Thus, the level and stage of commitment are
negotiated dynamically along with the allocation of the tasks
themselves.

The presented negotiation protocol is a strict generalization of the
CNP, and can thus always emulate it. Moreover, there are cases where
this protocol is better than the CNP, for the reasons listed earlier.
Yet, the development of appropriate negotiation strategies for this
protocol is challenging: e.g. how should an agent choose commitment
functions and payment functions?

2.3  Decommitting:  replies  vs.  timeouts 

The (6.1) field describes how long an offer on an alternative is valid.
If the negotiation partner has not answered by that time, the sender of
the message gets decommitted from that alternative. An alternative to
these strict deadlines is to send messages in which the (b) field is a
function of the time of response (similarly for the (c) and (d)
fields). This allows a contractor to describe a payment that decreases
as the acceptance of the contractor message is postponed. Similarly, it
allows a contractee to specify required payments that increase as the
acceptance of the contractee message is postponed. This motivates the
negotiation partner to respond quickly, but does not force a strict
deadline, which can inefficiently constrain that agent's local
deliberation scheduling. Both the strict deadline mechanism and this
time-dependent payment scheme require that the sending or receipt time
of a message can be verified by both parties.
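As a hedged illustration of a time-dependent (b) field, the sketch
below uses linear payment functions. The linear form, the parameter
names, and the helper deal_possible are assumptions made for this
example; the protocol itself allows arbitrary functions of response
time.

```python
def promised_payment(base: float, decay: float, response_time: float) -> float:
    """Contractor's promised payment, decreasing linearly as acceptance is postponed."""
    return max(0.0, base - decay * response_time)


def required_payment(base: float, growth: float, response_time: float) -> float:
    """Contractee's required payment, increasing linearly as acceptance is postponed."""
    return base + growth * response_time


def deal_possible(response_time: float,
                  promised_base: float, decay: float,
                  required_base: float, growth: float) -> bool:
    """True if, at this response time, the promised payment still covers the required one."""
    return (promised_payment(promised_base, decay, response_time)
            >= required_payment(required_base, growth, response_time))
```

With a promised payment starting at 100 and dropping by 2 per time
unit, and a required payment starting at 60 and rising by 3 per time
unit, agreement is possible at time 5 (90 against 75) but no longer at
time 10 (80 against 90). No strict deadline is imposed; delay simply
erodes the room for agreement.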

An alternative to automatic decommitment by the deadline is to have the
negotiation partner send a negative reply (negotiation termination
message) by the deadline. Such forced response messages are not viable
among SI agents, because an agent that has decided not to accept or
counterpropose has no reason to send a reply. Still, sending reply
messages in negative cases as well allows the offering agent to
decommit before the validity time of its offer ends. This frees that
agent from considering the effects of the possible acceptance of that
offer on the marginal costs of other task sets that the agent is
negotiating over. The saved computation can be used to negotiate faster
on other contracts. Thus, an agent considering sending a negative reply
may want to send it in cases where the offering agent is mostly
negotiating with that agent, but not in cases where the offering agent
is that agent's competing offerer in most other negotiations.


3  Implications  of  bounded  rationality 

Interactions of SI agents have been widely studied in microeconomics
[Kreps, 1990; Varian, 1992; Raiffa, 1982] and DAI [Rosenschein and
Zlotkin, 1994; Ephrati and Rosenschein, 1991; Kraus et al., 1992;
Durfee et al., 1993], but perfect rationality of the agents has usually
been assumed: flawless deduction, optimal reasoning about future
contingencies and recursive modeling of other agents. Perfect
rationality implies that agents can compute their marginal costs for
tasks exactly and immediately, which is untrue in most practical
situations. An agent is boundedly rational because its computation
resources are costly, or because they are bounded and the environment
keeps changing, e.g. new tasks arrive and there is a bounded amount of
time before each part of the solution is used [Garvey and Lesser, 1994;
Sandholm and Lesser, 1994; Zilberstein, 1993; Simon, 1982; Good, 1971].
Contracting agents have the following additional real-time pressures:

• A counteroffer or an acceptance message has to be sent by a deadline
(field 6.1); otherwise the negotiation terminates (Fig. 2). If the
negotiation terminates, the agent can begin a new negotiation on the
same issues, but it will not have the other agent's commitment at
first.

• Sending an outgoing offer too late may cause the receiving agent to
make a contract on some of the same tasks with some other agent who
negotiated earlier, thus disabling this contract even if the offer
makes the deadline. In case this deadline-abiding offer is an
acceptance message (as opposed to a counteroffer), the partner has to
pay the decommitment penalty that it had declared.

• The (b)-(d) fields can be functions of response time (Fig. 1). An
agent may get paid less for handling tasks (or pay more for having
tasks handled), be required to commit more strongly, or receive a
weaker commitment from the negotiation partner if its response is
postponed.

• The agent's cost of breaking commitments (after a contract is made)
may increase with time.

This problem setup leads to a host of local deliberation scheduling
issues. An agent has to decide how much computation it should allocate
to refining its marginal cost estimate of a certain task set. With a
bounded CPU, if too much time is allocated, another agent may win the
contract before the reply is sent, or not enough time remains for
refining marginal costs of other task sets. If too little time is
allocated, the agent may make an unbeneficial contract concerning that
task set. If multiple negotiations are allowed simultaneously, the
agent has to decide on which sets of tasks (offered to it or
potentially offered by it) its bounded computation should be focused,
and in what order. It may want to ignore some of its contracting
possibilities in order to devote more deliberation time to computing
marginal costs for the task sets of some selected potential contracts.
So, there is a tradeoff between getting more exact marginal cost
estimates and being able to engage in a larger number of negotiations.

The CNP did not consider an agent's risk attitude toward being
committed to activities it may not be able to honor, or the honoring of
which may turn out unbeneficial. In our protocol, an agent can take a
risk by making offers while the acceptance of earlier offers is
pending. Contracting during pending commitments speeds up the
negotiations because an agent does not have to wait for results on
earlier commitments before carrying on with other negotiations. The
work on TRACONET formalized the questions of risk attitude in a 3-stage
(announce-bid-award) full-commitment protocol, and chose a risk taking
strategy where each agent ignored the chances of pending commitments
being accepted in order to avoid computations regarding these
alternative future worlds. This choice was static, but more advanced
agents should use a risk taking strategy where negotiation risk is
explicitly traded off against added computation regarding the marginal
cost of the task set in the alternative worlds, where different
combinations of sent pending offers are accepted.

There is a tradeoff between accepting or (counter)proposing early on
and waiting:

• A better offer may be received later.

• Waiting for more simultaneously valid offers enables an agent to
identify and accept synergic ones: having more options available at the
decision point enables an agent to make more informed decisions.

• Accepting early on simplifies costly marginal cost computations,
because there are fewer options to consider. An option corresponds to
an item in the power set of offers that an agent can accept or make.

• By waiting, an agent may miss opportunities due to others making
related contracts first.
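To make the size of the option space concrete, the following minimal
sketch enumerates the power set of a list of offers. The function name
options and the offer labels are hypothetical, chosen only for this
illustration.

```python
from itertools import combinations


def options(offers):
    """Return every subset of the given offers (the power set) as frozensets."""
    offers = list(offers)
    return [
        frozenset(subset)
        for size in range(len(offers) + 1)
        for subset in combinations(offers, size)
    ]


three_pending = options(["o1", "o2", "o3"])  # 2**3 subsets
two_pending = options(["o1", "o2"])          # 2**2 subsets
```

Each additional simultaneously valid offer doubles the number of
options, which is why settling one offer early halves the set of
combinations whose marginal costs would otherwise have to be estimated.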

An agent should anticipate future negotiation and domain events in its
strategy [Sandholm and Lesser, 1995b]. It suffices to take these events
into account in marginal cost estimation: this will cause the agent to
anticipate them in its domain solution. The real marginal cost of a
task set is the difference in the streams of payments and domain costs
when an agent has the task set and when the agent does not have it.
This marginal cost does not necessarily equal the cost that is obtained
statically at contract time (before the realization of unknown future
negotiation events and domain events) by taking the difference of the
cost of the agent's optimal solution with the task set and the optimal
solution without it. Furthermore, for BR agents, the marginal cost may
change as more computation is allocated to the solution including the
task set or the solution without it. In general, the marginal cost of a
task set depends on which other tasks the agent has. Therefore,
theoretically, the marginal cost of a task set has to be computed in
all of the alternative future worlds, where different combinations of
pending, to-be-sent, and to-be-received offers have been accepted,
different combinations of old and to-occur contracts have been broken
by decommitting (by the agent or its partners), and different
combinations of domain events have occurred. Managing such
contingencies formally using probability theory is intractable: costs
of such computations should be explicitly traded off against the domain
advantage they provide. An agent can safely ignore the chances of other
agents decommitting only if the decommitment penalties are high enough
to surely compensate for the agent's potential loss. Similarly, an
agent has to ignore its decommitting possibilities if its penalties are
too high. The exponential number of alternative worlds induced by
decommitting options sometimes increases computational complexity more
than the benefit from the gradual commitment scheme warrants. Moreover,
the decommitting events are not independent: chains of decommitting
complicate the management of decommitment probabilities. Thus,
decommitment penalty functions that increase rapidly in time may often
be appropriate for BR agents.

^10 The agent can believe that domain events occur to the agent society
according to some distribution and that in steady state these events
will affect (directly or by negotiation) the agent according to some
distribution. E.g. the agent assumes that future tasks end up in its
task set according to a distribution. On another level, an agent can
try to outguess the other agents' solutions so that it can use the
others' marginal costs as a basis for its own marginal cost
calculation. On a third level, the agent can model what another agent
is guessing about yet another agent, and so on ad infinitum. There is a
tradeoff between allocating costly computation resources to such
recursive modeling and gaining domain advantage by enhanced
anticipation.
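The dependence of marginal cost on alternative future worlds can be
illustrated with a toy calculation. Everything specific here is an
assumption made for the example: the quadratic cost function, the
treatment of each pending offer as an independent acceptance event
(which the text notes does not hold in general), and the names cost and
expected_marginal_cost.

```python
from itertools import product


def cost(tasks) -> float:
    """Hypothetical domain cost of handling a task set: grows quadratically in its size."""
    return float(len(tasks) ** 2)


def expected_marginal_cost(own_tasks, new_tasks, pending) -> float:
    """Average the marginal cost of new_tasks over all worlds of pending-offer outcomes.

    pending maps a task to the assumed probability that the pending offer
    concerning it is accepted, so that the task ends up with this agent.
    """
    new = frozenset(new_tasks)
    items = list(pending.items())
    total = 0.0
    # One world per combination of accepted/not accepted: 2**len(pending) worlds.
    for outcome in product([False, True], repeat=len(items)):
        probability = 1.0
        world = set(own_tasks)
        for (task, p_accept), accepted in zip(items, outcome):
            probability *= p_accept if accepted else 1.0 - p_accept
            if accepted:
                world.add(task)
        total += probability * (cost(world | new) - cost(world))
    return total


static_estimate = cost({"t1", "t3"}) - cost({"t1"})                     # ignores pending offers
anticipating_estimate = expected_marginal_cost({"t1"}, {"t3"}, {"t2": 0.5})
```

In this toy setting the static estimate is 3, while the estimate that
accounts for the pending offer on t2 is 4, because taking on t3 is more
expensive in the world where t2 is also acquired. The loop visits a
number of worlds exponential in the number of pending offers, which is
the computational burden discussed above.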

Because new events are constantly occurring, the deliberation control
problem is stochastic. An agent should take the likelihood of these
events into account in its deliberation scheduling. The performance
profile of the local problem solving algorithm should be conditioned on
features of the problem instance [Sandholm and Lesser, 1994], on
performance on that instance so far [Sandholm and Lesser, 1994;
Zilberstein, 1993], and on performance profiles of closely related
optimizations (related calculations of marginal costs). These aspects
make exact decision theoretic deliberation control infeasible:
approximations are required. The need for this type of deliberation
control has not, to our knowledge, been well understood. Analytically
developing a domain independent control strategy that is instantiated
separately (using statistical methods) for each domain would allow
faster development of more efficient automated negotiators across
multiple domains.

4  Linking  negotiation  items 

In early CNP implementations, tasks were negotiated one at a time. This
is insufficient if the cost or feasibility of carrying out a task
depends on the carrying out of other tasks: there may be local optima,
where no transfer of a single task between agents enhances the global
solution, but transferring a larger set of tasks simultaneously does.
The need for larger transfers is well known in centralized iterative
refinement optimization [Lin and Kernighan, 1971; Waters, 1987], but
has been generally ignored in automated negotiation. TRACONET extended
the CNP to handle task interactions by having the announcer cluster
tasks into sets to be negotiated atomically. Alternatively, the bidder
could have done the clustering by counterproposing. Our protocol
generalizes this by allowing either party to do the clustering (Fig. 1)
at any stage of the protocol.
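A small numeric example may help show such a local optimum. The cost
model below (a fixed setup cost plus a per-task cost for each agent) and
all names are hypothetical, and only transfers from agent A to agent B
are enumerated.

```python
from itertools import combinations


def agent_cost(tasks, setup: float, per_task: float) -> float:
    """Hypothetical cost: nothing when idle, otherwise a setup cost plus a per-task cost."""
    return 0.0 if not tasks else setup + per_task * len(tasks)


def global_cost(a_tasks, b_tasks) -> float:
    """Sum of both agents' costs; B is assumed cheaper per task than A."""
    return agent_cost(a_tasks, 6.0, 2.0) + agent_cost(b_tasks, 6.0, 1.0)


def improving_transfers(a_tasks, b_tasks, size: int):
    """List the task sets of the given size whose transfer from A to B lowers the global cost."""
    base = global_cost(a_tasks, b_tasks)
    found = []
    for subset in combinations(sorted(a_tasks), size):
        moved = set(subset)
        if global_cost(a_tasks - moved, b_tasks | moved) < base:
            found.append(subset)
    return found


single = improving_transfers({"t1", "t2"}, set(), 1)   # no single task helps
cluster = improving_transfers({"t1", "t2"}, set(), 2)  # the pair does
```

Here the initial global cost is 10. Moving one task raises it to 15
because both agents then pay the setup cost, whereas moving both tasks
together lowers it to 8. Negotiating the two tasks atomically is what
escapes the local optimum.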

The equivalent of large transfers can be accomplished by smaller ones
if the agents are willing to take risks. Even if no small contract is
individually beneficial, the agents can sequentially make all the small
contracts that sum up to a large beneficial one. Early in this
sequence, the global solution degrades until the later contracts
enhance it. When making the early commitments, at least one of the two
agents has to risk taking a permanent loss in case the partner does not
agree to the later contracts. Our protocol allows such risks to be
decreased as much as the agents prefer, because commitments can be
broken by paying a penalty. The penalty function may be explicitly
conditioned on the acceptance of the future contracts, or it may
specify low commitment for a short time during which the agent expects
to make the remaining contracts of the sequence.

Sometimes there is no task set size such that transferring such a set
from one agent to another enhances the global solution. Yet, there may
be a beneficial swap of tasks, where the first agent subcontracts some
tasks to the second and the second subcontracts some to the first.
Swaps can be explicitly implemented in a negotiation protocol by
allowing some task sets in an alternative (Fig. 1) to specify tasks to
contract in and some to specify tasks to contract out. In the task sets
added to implement swaps, "Minimum" in field (a) should be changed to
"Maximum" and vice versa. In field (b), "Promised payment fn. to
contractee" should be changed to "Required payment fn. from contractee"
and "Required payment fn. to contractee" should be changed to "Promised
payment fn. from contractee". Alternatively, in protocols that do not
explicitly incorporate swaps, they can be made by agents taking risks
and constructing the swap as a sequence of one-way task transfer
contracts. Here too, the decommitment penalty functions can be
conditioned on later contracts in the sequence or on time to reduce (or
remove) risk.

5  Mutual  vs.  multiagent  contracts 

Negotiations may have reached a local optimum with respect to each
agent's local search operators and mutual contract operators (transfers
and swaps of any size), but solution enhancements would be possible if
tasks were transferred among more than two agents, e.g. agent A
subcontracts a task to C and B subcontracts a task to C. There are two
main ways to implement such deals^11:

1. Explicit multiagent contracts. These contract operators can be
viewed as atomic operators in the global task allocation space. First,
one agent (with an incomplete view of the other agents' tasks and
resources) has to identify that a potential multiagent contract is
beneficial. Alternatively, the identification phase can be implemented
in a distributed manner. Second, the protocol has to allow a multiagent
contract. This can be done e.g. by circulating the contract message
among the parties and agreeing that the contract becomes valid only if
every agent signs.

2. Multiagent contracts through mutual contracts. A multiagent contract
is equivalent to a sequence of mutual contracts. In cases where a local
optimum with respect to mutual contracts has been reached, the first
mutual contracts in the sequence will incur losses. Thus, one or more
agents have to incur risk in initially taking unbeneficial contracts in
unsure anticipation of more than compensatory future contracts. Our
protocol provides mechanisms for decreasing this risk, either by
conditioning the decommitment penalty functions on whether the
contracts with other agents take place, or by choosing the penalties to
be low early on and increase with time. In the limit, the penalty is
zero (theoretically possibly even negative) for all contracts in the
sequence if some contract in it is not accepted. The remaining problem
with contingency contracts is the monitoring of the events that the
contract (penalty) is contingent on: how can the contractee monitor the
contractor's events and vice versa?

^11 Sathi et al. [Sathi and Fox, 1989] did this by having a centralized
mediator cluster several announcements and bids from multiple agents
into atomic contracts. That is unreasonable if decentralization is
desired.
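The two risk-reducing penalty schemes mentioned above can be sketched
as follows. The linear ramp, the function names, and the contract
labels are assumptions made for illustration; the sketch also assumes
that acceptance events are observable by both parties, which is exactly
the monitoring problem raised above.

```python
def contingent_penalty(full_penalty: float, sequence, accepted) -> float:
    """Decommitment penalty conditioned on the whole sequence of contracts.

    The penalty is zero while some contract in the sequence is not yet
    accepted, and the full amount once all of them are.
    """
    return full_penalty if all(c in accepted for c in sequence) else 0.0


def time_ramp_penalty(full_penalty: float, elapsed: float, ramp_end: float) -> float:
    """Decommitment penalty that starts low and grows linearly until ramp_end (> 0)."""
    if elapsed >= ramp_end:
        return full_penalty
    return full_penalty * max(elapsed, 0.0) / ramp_end


sequence = ["A->B", "B->C", "C->A"]
early = contingent_penalty(10.0, sequence, accepted={"A->B"})
complete = contingent_penalty(10.0, sequence, accepted=set(sequence))
halfway = time_ramp_penalty(10.0, elapsed=5.0, ramp_end=10.0)
```

Under the contingent scheme an agent that entered the first contract
can withdraw at no cost until the rest of the sequence is in place.
Under the time ramp it can withdraw cheaply only during the window in
which it expects to complete the remaining contracts.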

Sometimes an agent can commit to an unprofitable early contract in the
sequence without risk even with constant high decommitting penalties.
E.g. if an agent has received committal offers on two contracts, it can
accept both without risk, assuming that decommitment penalties for the
two senders are so high that they will not decommit. Even though the
agent may have some offers committed simultaneously, the likelihood of
having all the necessary offers committed simultaneously decreases as
the number of mutual contracts required in the multiagent contract
increases. Sometimes there is a loop of agents in the sequence of
mutual contracts, e.g. suppose that the only profitable operator is the
following: agent A gives task 1 to agent B, agent B gives task 2 to
agent C, and agent C gives task 3 to agent A. In such cases it is
impossible to handle the multiagent contract as separate mutual
contracts without risk (without tailoring the decommitment penalty
functions). A negotiating agent should take the possibilities of such
loops into account when estimating the probabilities of receiving
certain tasks, because the very offering or accepting of a certain task
may directly affect the likelihood of getting offers or acceptances for
other tasks.

6  Message  congestion:  Tragedy  of  the  commons

Most distributed implementations of automated contracting have run into
message congestion problems [Smith, 1980; Van Dyke Parunak, 1987;
Sandholm, 1993]. While an agent takes a long time to process a large
number of received messages, even more messages have time to arrive,
and there is a high risk that the agent will finally be saturated.
Attempts to solve these problems include focused addressing [Smith,
1980], audience restrictions [Van Dyke Parunak, 1987; Sandholm, 1993]
and ignoring incoming messages that are sufficiently outdated
[Sandholm, 1993]. Focused addressing means that in highly constrained
situations, agents with free resources announce availability, while in
less constrained situations, agents with tasks announce tasks. This
avoids announcing too many tasks in highly constrained situations,
where these announcements would seldom lead to results. In less
constrained environments, resources are plentiful compared to tasks, so
announcing tasks focuses negotiations with fewer messages. Audience
restrictions mean that an agent can only announce to a subset of
agents, namely those that are supposedly the most promising.

Focused addressing and audience restrictions are imposed on an agent by
a central designer of the agent society. Neither is viable in open
systems with SI agents. An agent will send a message whenever it is
beneficial to itself even though this might saturate other agents. With
flat rate media such as the Internet, an agent prefers sending to
almost everyone who has non-zero probability of
accepting/counterproposing. The society of agents would be better off
with less congested communication links achieved through restricted
sending, but each agent sends as long as the expected utility from that
message exceeds the decrease in utility to that agent caused by the
congesting effect of that message in the media. This defines a tragedy
of the commons [Turner, 1992; Hardin, 1968] (n-player prisoners'
dilemma). The tragedy occurs only for low commitment messages (usually
early in a negotiation): having multiple high commitment offers out
simultaneously increases an agent's negotiation risk (Sec. 2.2) and
computation costs (Sec. 3).

The obvious way to resolve the tragedy is a use-based communication
charge. Another is mutual monitoring: an agent can monitor how often a
certain other agent sends low commitment messages to it, and over-eager
senders can be punished. By mutual monitoring, audience restrictions
can also be implemented: if an agent receives an announcement although
it is not in the appropriate audience, it can directly identify the
sender as a violator. Our protocol allows an agent to specify in its
offer (field 6.4) a processing fee that an accepting or counterproposing
agent has to submit in its response (field 6.3) for the response to be
processed. This implements a self-selecting dynamic audience
restriction that is viable among SI agents.
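The self-selecting effect of the processing fee can be sketched with a
simple decision rule. The rule that an agent responds only when its
expected gain from responding exceeds the fee, the dictionary layout of
a response, and the function names are all assumptions made for this
illustration; fields 6.3 and 6.4 are represented only by plain numbers.

```python
def self_selected_responders(expected_gain, fee: float):
    """Agents whose expected gain from responding exceeds the required processing fee."""
    return sorted(agent for agent, gain in expected_gain.items() if gain > fee)


def processed_responses(responses, required_fee: float):
    """The offering agent processes only responses that enclose at least the required fee."""
    return [r for r in responses if r["fee_paid"] >= required_fee]


expected_gain = {"B": 0.2, "C": 3.0, "D": 7.5}
audience = self_selected_responders(expected_gain, fee=1.0)

responses = [
    {"sender": "C", "fee_paid": 1.0},
    {"sender": "E", "fee_paid": 0.0},  # response without the fee is ignored
]
handled = processed_responses(responses, required_fee=1.0)
```

No central designer has to choose the audience: raising the fee shrinks
the set of agents for which responding is worthwhile, and each agent
makes that decision from its own estimate.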

7  Conclusions 

We introduced a collection of issues that arise in automated
negotiation systems consisting of BRSI agents. Reasons for dynamically
chosen commitment stage and level were given and a protocol that
enables this was presented. The need for explicit local deliberation
scheduling was shown by tradeoffs between computation costs and
negotiation benefits and risk. Linking negotiation items and multiagent
contracts were presented as methods to avoid local optima in the global
task allocation space, and their implementation among BRSI agents was
discussed. Finally, mechanisms for handling message congestion among SI
agents were presented.

Negotiations among BRSI agents also involve other issues (detailed in
[Sandholm and Lesser, 1995b] due to limited space here) such as:
insufficiency of the Vickrey auction to promote truth-telling and stop
counterspeculation, usefulness of long term strategic contracts,
tradeoffs between enforced and unenforced contracts [Sandholm and
Lesser, 1995d], and knowing when to terminate the negotiations, either
when an optimum with respect to the current tasks and resources has
been reached or when further negotiation overhead outweighs the
associated benefits. Coalition formation among BRSI agents has been
studied in [Sandholm and Lesser, 1995c].